Proc AMIA Symp. 2002:160-4.

A comparison of the Charlson comorbidities derived from medical language processing and administrative data.

Author information

  • Department of Medical Informatics, Columbia University, New York, NY, USA.


The objective of this study was to develop a medical language processing (MLP) system, consisting of MedLEE and a set of inference rules, to identify 19 Charlson comorbidities from discharge summaries and chest x-ray reports. We used 233 cases to learn the patterns indicative of comorbidities and to develop the inference rules. We then used an independent data set of 3,662 pneumonia patients to compare the comorbidities identified by MLP with those recorded in administrative data (ICD-9 codes). A stratified random sample of 190 records from disagreement cases was manually reviewed. The sensitivity, specificity, and accuracy for the MLP system/ICD-9 codes in this testing set were 0.84/0.16, 0.70/0.30, and 0.77/0.23, respectively. Thirteen of the 19 comorbidities studied were underreported in the administrative data. The kappa values ranged from 0.19 for peptic ulcer to 0.70 for lymphoma. We conclude that comorbidities derived from natural language processing of medical records can improve ICD-9-based approaches.
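
The evaluation metrics named in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the `Confusion` class, the function names, and every count below are hypothetical and are not data from the study. It also shows one plausible reading of why the reported MLP/ICD-9 pairs sum to 1: if every reviewed record is a disagreement case, each record is correct under exactly one method, so one method's true positives are the other's false negatives, and so on. Kappa is computed here as Cohen's kappa for two binary raters, which is an assumption about the statistic used.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """Counts of one method's calls against the manual-review reference."""
    tp: int  # method says present, reviewer says present
    fp: int  # method says present, reviewer says absent
    fn: int  # method says absent, reviewer says present
    tn: int  # method says absent, reviewer says absent


def sensitivity(c: Confusion) -> float:
    return c.tp / (c.tp + c.fn)


def specificity(c: Confusion) -> float:
    return c.tn / (c.tn + c.fp)


def accuracy(c: Confusion) -> float:
    return (c.tp + c.tn) / (c.tp + c.fp + c.fn + c.tn)


def flip_for_disagreement(c: Confusion) -> Confusion:
    """Counts for the *other* method when the two methods disagree on
    every record: each correct call becomes the other method's error."""
    return Confusion(tp=c.fn, fp=c.tn, fn=c.tp, tn=c.fp)


def cohen_kappa(both: int, only_a: int, only_b: int, neither: int) -> float:
    """Cohen's kappa for two binary raters from a 2x2 agreement table."""
    n = both + only_a + only_b + neither
    observed = (both + neither) / n
    p_a = (both + only_a) / n  # rater A's positive rate
    p_b = (both + only_b) / n  # rater B's positive rate
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Hypothetical counts chosen only to illustrate the arithmetic.
    mlp = Confusion(tp=40, fp=20, fn=10, tn=30)
    icd = flip_for_disagreement(mlp)
    for name, c in (("MLP", mlp), ("ICD-9", icd)):
        print(name, sensitivity(c), specificity(c), accuracy(c))
    print("kappa", cohen_kappa(both=20, only_a=5, only_b=10, neither=65))
```

Under this reading, the complementary values describe performance on disagreement cases only and say nothing about records where the two sources agree.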

[PubMed - indexed for MEDLINE]