# koleck2019natural

# Natural language processing of symptoms documented in free-text narratives of electronic health records: A systematic review


Koleck et al. review 27 papers on Natural Language Processing (NLP) systems for electronic health records (EHRs) [1]. The papers were found via PubMed and EMBASE and filtered down from an initial 2,553 papers. The scope covers Symptom Science research that focuses on the description, evaluation, or use of an NLP algorithm or pipeline to process or analyze patient symptom terms.

Koleck et al. produce a performance metrics table that lists the underlying NLP techniques, vocabularies, and evaluation metrics used by the authors of each paper. They further evaluate the papers against the aspects shown in Table 2 and assign a "quality indicator" to each one.
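This summary does not say which evaluation metrics the surveyed papers report. As a minimal sketch, assuming the common NLP trio of precision, recall, and F1, the snippet below scores a set of extracted symptom terms against a gold-standard annotation. The function name `precision_recall_f1` and the example term sets are hypothetical and are not taken from the review.

```python
def precision_recall_f1(predicted, gold):
    """Score extracted terms against gold-standard terms (set-based, exact match)."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    # Guard against empty sets to avoid division by zero.
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical data: symptom terms annotated by a clinician vs. terms an NLP system extracted.
gold_symptoms = {"fatigue", "nausea", "pain", "dyspnea"}
extracted_symptoms = {"fatigue", "nausea", "fever"}

precision, recall, f1 = precision_recall_f1(extracted_symptoms, gold_symptoms)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Exact set matching is the simplest possible scoring rule. Real evaluations often match at the span or concept level, so scores computed this way are not directly comparable across papers.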


Table 1 gives a basic description of the 27 papers in this survey.



Miaskowski et al. [2] report in their survey that 83% of the 158 surveyed papers focus on oncology, the most popular topic in Symptom Science. As further evidence, the authors found more than 7,000 oncology-related papers during their literature search. However, only 3 papers in this survey focus on NLP for oncology EHRs, which leaves a large gap to be filled.

The authors observe that only 33% of the surveyed papers report patient demographic characteristics. Corwin et al. [3] point out that such information is essential for future NLP studies: symptom experience is known to vary by common demographic factors, so reporting it helps avoid potential bias and improves the effectiveness of tailored interventions.

According to the authors, symptoms are subjective, while signs are objective evidence of disease [1]. Most of the surveyed papers either failed to distinguish between signs and symptoms or inaccurately classified signs as symptoms.

Unsolved Problems

The authors conclude that there is a lack of true comparative evaluation of the NLP algorithms or pipelines used. An established protocol or platform for sharing NLP algorithms is absent. The small size of the data samples that are freely available for open research is also slowing the community down.

Papers Cited


Years Spanned


Application Domain

Natural Language Processing on EHRs for Symptom Science


  1. Koleck, T. A., Dreisbach, C., Bourne, P. E., & Bakken, S. (2019). Natural language processing of symptoms documented in free-text narratives of electronic health records: A systematic review. Journal of the American Medical Informatics Association, 26(4), 364–379. https://doi.org/10.1093/jamia/ocy173

  2. Miaskowski, C., Barsevick, A., Berger, A., Casagrande, R., Grady, P. A., Jacobsen, P., … Marden, S. (2017). Advancing symptom science through symptom cluster research: Expert panel proceedings and recommendations. Journal of the National Cancer Institute. https://doi.org/10.1093/jnci/djw253

  3. Corwin, E. J., Berg, J. A., Armstrong, T. S., DeVito Dabbs, A., Lee, K. A., Meek, P., & Redeker, N. (2014). Envisioning the future in symptom science. Nursing Outlook. https://doi.org/10.1016/j.outlook.2014.06.006

🔄 Last Updated: 7/8/2020, 9:48:18 PM