Send to

Choose Destination

Applying natural language processing toolkits to electronic health records - an experience report.

Author information

Department of Computer Science, University of Victoria, Victoria, BC, Canada.


A natural language challenge devised by Informatics for Integrating Biology and the Bedside (i2b2) was to analyze free-text health data to construct a multi-class, multi-label classification system focused on obesity and its co-morbidities. This report presents a case study in which a natural language processing (NLP) toolkit, called NLTK, was used in the challenge. This report provides a brief review of NLP in the context of EHR applications, briefly surveys and contrasts some existing NLP toolkits, and reports on our experiences with the i2b2 case study. Our efforts uncovered issues including the lack of human annotated physician notes for use as NLP training data, differences between conventional free-text and medical notes, and potential hardware and software limitations affecting future projects.

[Indexed for MEDLINE]

Supplemental Content

Full text links

Icon for IOS Press
Loading ...
Support Center