Evaluating the state of the art in disorder recognition and normalization of the clinical narrative

Sameer Pradhan; Noémie Elhadad; Brett R South; David Martinez; Lee Christensen; Amy Vogel; Hanna Suominen; Wendy W Chapman; Guergana Savova

doi:10.1136/amiajnl-2013-002544

J Am Med Inform Assoc. 2015 Jan;22(1):143-54. doi: 10.1136/amiajnl-2013-002544. Epub 2014 Aug 21.

Authors

Sameer Pradhan¹, Noémie Elhadad², Brett R South³, David Martinez⁴, Lee Christensen³, Amy Vogel², Hanna Suominen⁵, Wendy W Chapman³, Guergana Savova¹

Affiliations

¹ Boston Children's Hospital and Harvard Medical School, Boston, Massachusetts, USA.
² Columbia University, New York, New York, USA.
³ University of Utah, Salt Lake City, Utah, USA.
⁴ The University of Melbourne, Australia.
⁵ NICTA, The Australian National University, and University of Canberra, Canberra, Australian Capital Territory, Australia.

Abstract

Objective: The ShARe/CLEF eHealth 2013 Evaluation Lab Task 1 was organized to evaluate the state of the art on clinical text in (i) disorder mention identification/recognition based on the Unified Medical Language System (UMLS) definition (Task 1a) and (ii) disorder mention normalization to an ontology (Task 1b). No such community evaluation had previously been conducted. Task 1a received a total of 22 system submissions, and Task 1b received 17. Most of the systems employed a combination of rules and machine learners.

Materials and methods: We used a subset of the Shared Annotated Resources (ShARe) corpus of annotated clinical text: 199 clinical notes for training and 99 for testing (roughly 180K words in total). We provided the community with the annotated gold standard training documents to build systems to identify and normalize disorder mentions. The systems were tested on a held-out gold standard test set to measure their performance.

Results: For Task 1a, the best-performing system achieved an F1 score of 0.75 (0.80 precision; 0.71 recall). For Task 1b, another system performed best with an accuracy of 0.59.
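As a quick consistency check on the Task 1a figures, the sketch below recomputes F1 as the standard harmonic mean of precision and recall. The function name `f1_score` is a hypothetical helper, not part of the task's evaluation tooling. The abstract does not say how span matches were counted, so this only illustrates the general formula applied to the rounded values reported above.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Rounded precision and recall reported for the best Task 1a system.
best_task_1a = f1_score(0.80, 0.71)
print(round(best_task_1a, 2))  # prints 0.75
```

Because the inputs are themselves rounded to two decimals, the recomputed value agrees with the reported F1 only to that precision.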

Discussion: Most of the participating systems used a hybrid approach by supplementing machine-learning algorithms with features generated by rules and gazetteers created from the training data and from external resources.

Conclusions: The task of disorder normalization is more challenging than that of identification. The ShARe corpus is available to the community as a reference standard for future studies.

Keywords: Clinical Notes; Disorder Identification; Information Extraction; Named Entity Recognition; Natural Language Processing; Word Sense Disambiguation.

Publication types

Evaluation Study
Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't
Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

Biological Ontologies
Datasets as Topic
Disease*
Electronic Health Records*
Humans
Information Storage and Retrieval / methods
Natural Language Processing*
Systematized Nomenclature of Medicine
Unified Medical Language System
Vocabulary, Controlled*
