BMC Med Inform Decis Mak. 2013 Mar 2;13:30. doi: 10.1186/1472-6947-13-30.

Improving sensitivity of machine learning methods for automated case identification from free-text electronic medical records.

Author information

Department of Medical Informatics, Erasmus Medical Center, P.O. Box 2040, Rotterdam 3000CA, Netherlands. m.afzal@erasmusmc.nl

Abstract

BACKGROUND:

Distinguishing cases from non-cases in free-text electronic medical records is an important initial step in observational epidemiological studies, but manual record validation is time-consuming and cumbersome. We compared different approaches to develop an automatic case identification system with high sensitivity to assist manual annotators.

METHODS:

We used four different machine-learning algorithms to build case identification systems for two data sets, one comprising hepatobiliary disease patients, the other acute renal failure patients. To improve the sensitivity of the systems, we varied the imbalance ratio between positive cases and negative cases using under- and over-sampling techniques, and applied cost-sensitive learning with various misclassification costs.
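As a rough illustration of the two ideas in this paragraph, the sketch below shows random under-sampling to a target imbalance ratio and a cost-derived decision threshold. It is not the authors' implementation: the function names, the definition of imbalance ratio as negatives per positive, and the threshold rule (the standard cost-minimizing cut-off for a calibrated probability) are all assumptions made for the example.

```python
import random


def imbalance_ratio(labels):
    """Negatives per positive (assumed definition of 'imbalance ratio')."""
    positives = sum(1 for y in labels if y == 1)
    negatives = sum(1 for y in labels if y == 0)
    return negatives / positives


def undersample(labels, target_ratio, seed=0):
    """Keep every positive and a random subset of negatives.

    Returns sorted indices such that the kept set has at most
    `target_ratio` negatives per positive.
    """
    rng = random.Random(seed)
    pos_idx = [i for i, y in enumerate(labels) if y == 1]
    neg_idx = [i for i, y in enumerate(labels) if y == 0]
    n_keep = min(len(neg_idx), int(target_ratio * len(pos_idx)))
    return sorted(pos_idx + rng.sample(neg_idx, n_keep))


def cost_threshold(cost_fp, cost_fn):
    """Probability cut-off that minimizes expected misclassification cost.

    Predict 'case' when the estimated probability is at or above this value.
    Raising cost_fn relative to cost_fp lowers the threshold, which trades
    specificity for sensitivity.
    """
    return cost_fp / (cost_fp + cost_fn)


# Hypothetical toy data: 10 cases and 200 non-cases (ratio 20).
labels = [1] * 10 + [0] * 200
kept = undersample(labels, target_ratio=5)
kept_labels = [labels[i] for i in kept]
```

Over-sampling works the other way round, duplicating positives instead of discarding negatives. In both cases the aim is a classifier that misses fewer true cases, at the price of more false positives.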

RESULTS:

For the hepatobiliary data set, we obtained a high sensitivity of 0.95 (on a par with manual annotators, as compared to 0.91 for a baseline classifier) with specificity 0.56. For the acute renal failure data set, sensitivity increased from 0.69 to 0.89, with specificity 0.59. Performance differences between the various machine-learning algorithms were not large. Classifiers performed best when trained on data sets with imbalance ratio below 10.
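For readers less familiar with the two measures reported here, the following sketch computes them from confusion-matrix counts. The counts are hypothetical, chosen only so that the results match the hepatobiliary rates quoted above; they are not the study's data.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of actual cases that the classifier flags as cases."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of actual non-cases that the classifier flags as non-cases."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts: 100 true cases and 100 true non-cases.
sens = sensitivity(true_pos=95, false_neg=5)
spec = specificity(true_neg=56, false_pos=44)
```

With these illustrative counts, 5 of 100 cases would be missed and 56 of 100 non-cases would be filtered out before manual review.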

CONCLUSIONS:

We were able to achieve high sensitivity with moderate specificity for automatic case identification on two data sets of electronic medical records. Such a highly sensitive case identification system can be used as a pre-filter to significantly reduce the burden of manual record validation.

PMID:
23452306
PMCID:
PMC3602667
DOI:
10.1186/1472-6947-13-30
[Indexed for MEDLINE]
Free PMC Article