AMIA Jt Summits Transl Sci Proc. 2019 May 6;2019:610-619. eCollection 2019.

Comparison of Natural Language Processing Techniques in Analysis of Sparse Clinical Data: Insulin Decline by Patients.

Author information

1 Brigham and Women's Hospital.
2 Harvard Medical School; Boston, MA.

Abstract

We present a comparative evaluation of popular Natural Language Processing (NLP) approaches to Information Extraction (IE) from clinical documents, applied to detecting cases in which patients decline medication recommended by their providers. Specifically, we address the task of identifying patients with diabetes who decline insulin, using a training set of 51k randomly selected provider notes. Analysis shows that insulin decline by patients is a rare phenomenon, with a document-level prevalence of approximately 0.1%. We examine the effectiveness of three widely used IE approaches: sentence-level classification with support vector machines (SVMs), token-level sequence labelling with conditional random fields (CRFs), and rule-based detection that encodes human knowledge. On a held-out test set, the rule-based approach generalizes best (F1=0.97), outperforming the SVM (F1=0.61) and CRF (F1=0.40) models.
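To make the rule-based idea and the F1 metric concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' system: the regular expressions, the function names `mentions_insulin_decline` and `f1_score`, and the example notes are all hypothetical and do not come from the paper.

```python
import re

# Hypothetical patterns written for illustration only; these are NOT the
# rules used in the paper. Each pattern looks for a decline/refusal cue
# within 40 characters of the word "insulin", inside a single sentence.
DECLINE_PATTERNS = [
    re.compile(r"\b(declin\w*|refus\w*)\b[^.]{0,40}\binsulin\b", re.IGNORECASE),
    re.compile(r"\binsulin\b[^.]{0,40}\b(declin\w*|refus\w*)\b", re.IGNORECASE),
    re.compile(r"\b(does not|doesn't|did not) want\b[^.]{0,40}\binsulin\b", re.IGNORECASE),
]


def mentions_insulin_decline(note: str) -> bool:
    """Return True if any illustrative rule fires anywhere in the note."""
    return any(pattern.search(note) for pattern in DECLINE_PATTERNS)


def f1_score(gold, predicted) -> float:
    """Compute F1 for the positive class from parallel lists of booleans."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g and p)       # true positives
    fp = sum(1 for g, p in pairs if not g and p)   # false positives
    fn = sum(1 for g, p in pairs if g and not p)   # false negatives
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Made-up example notes, not taken from any clinical dataset.
    notes = [
        "Patient declined insulin therapy at this visit.",
        "Continue insulin glargine 10 units nightly.",
        "Discussed starting insulin; she refuses for now.",
    ]
    for note in notes:
        print(mentions_insulin_decline(note), "-", note)
```

With a positive class as rare as the one described (about 0.1% of documents), accuracy is uninformative, which is why the comparison is reported in terms of F1 on the positive class.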

PMID: 31259016
PMCID: PMC6568116
