J Clin Gastroenterol. 2016 Nov/Dec;50(10):889-894.

Defining a Patient Population With Cirrhosis: An Automated Algorithm With Natural Language Processing.

Author information

Divisions of *Digestive Diseases †General Internal Medicine, David Geffen School of Medicine at UCLA ‡UCLA Office of Health Informatics & Analytics, UCLA Health, Los Angeles, CA.



The objective of this study was to use natural language processing (NLP) as a supplement to International Classification of Diseases, Ninth Revision (ICD-9) codes and laboratory values in an automated algorithm to better define and risk-stratify patients with cirrhosis.


Identification of patients with cirrhosis by manual data collection is time-intensive and laborious, whereas using ICD-9 codes can be inaccurate. NLP, a novel computerized approach to analyzing electronic free text, has been used to automatically identify patient cohorts with gastrointestinal pathologies such as inflammatory bowel disease. This methodology has not yet been used in cirrhosis.


This retrospective cohort study was conducted at University of California, Los Angeles (UCLA) Health, an academic medical center. A total of 5343 UCLA primary care patients with ICD-9 codes for chronic liver disease were identified between March 2013 and January 2015. An algorithm incorporating NLP of radiology reports, ICD-9 codes, and laboratory data determined whether these patients had cirrhosis. Of the 5343 patients, 168 were randomly selected for manual chart review, which served as the gold standard for comparison. Positive predictive value (PPV), negative predictive value (NPV), sensitivity, and specificity were calculated for the algorithm as a whole and for each of its steps.


The algorithm's PPV, NPV, sensitivity, and specificity were 91.78%, 96.84%, 95.71%, and 93.88%, respectively. The NLP portion was the most important component of the algorithm with PPV, NPV, sensitivity, and specificity of 98.44%, 93.27%, 90.00%, and 98.98%, respectively.
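As an illustration of how these four measures relate to one another, the sketch below computes them from a 2x2 comparison against manual chart review. The abstract does not report the underlying counts; the values used here are an assumption, back-calculated as the only whole-number split of the 168 reviewed charts (70 with cirrhosis, 98 without) that reproduces the reported percentages. The names `ConfusionCounts` and `diagnostic_metrics` are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts from comparing a classifier with the chart-review gold standard."""
    tp: int  # flagged as cirrhosis, confirmed on chart review
    fp: int  # flagged as cirrhosis, not confirmed
    fn: int  # not flagged, but cirrhosis on chart review
    tn: int  # not flagged, no cirrhosis on chart review


def diagnostic_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Return PPV, NPV, sensitivity, and specificity as fractions."""
    return {
        "ppv": c.tp / (c.tp + c.fp),
        "npv": c.tn / (c.tn + c.fn),
        "sensitivity": c.tp / (c.tp + c.fn),
        "specificity": c.tn / (c.tn + c.fp),
    }


# ASSUMPTION: counts inferred from the reported percentages and n = 168;
# they are not stated in the abstract.
full_algorithm = ConfusionCounts(tp=67, fp=6, fn=3, tn=92)
nlp_step = ConfusionCounts(tp=63, fp=1, fn=7, tn=97)

for label, counts in [("Full algorithm", full_algorithm), ("NLP step", nlp_step)]:
    metrics = diagnostic_metrics(counts)
    summary = ", ".join(f"{name} {100 * value:.2f}%" for name, value in metrics.items())
    print(f"{label}: {summary}")
```

Under these inferred counts, the NLP step trades some sensitivity (7 missed cases versus 3) for fewer false positives (1 versus 6), which is consistent with its higher PPV and specificity relative to the full algorithm.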


NLP is a powerful tool that can be combined with administrative and laboratory data to identify patients with cirrhosis within a population.

[Indexed for MEDLINE]
