Ann Intern Med. 2012 Jan 3;156(1 Pt 1):11-8. doi: 10.7326/0003-4819-156-1-201201030-00003.

Comparison of natural language processing biosurveillance methods for identifying influenza from encounter notes.

Author information

  • Mount Sinai School of Medicine, New York, New York, USA. ontolimatics@gmail.com



BACKGROUND: An effective national biosurveillance system expedites outbreak recognition and facilitates response coordination at the federal, state, and local levels. The BioSense system, used at the Centers for Disease Control and Prevention, incorporates chief complaints but not data from the whole encounter note into its surveillance algorithms.


OBJECTIVE: To evaluate whether biosurveillance using data from the whole encounter note is superior to biosurveillance using data from the chief complaint field alone.


DESIGN: 6-year retrospective case-control cohort study.


SETTING: Mayo Clinic, Rochester, Minnesota.


PATIENTS: 17,243 persons tested for influenza A or B virus between 1 January 2000 and 31 December 2006.


MEASUREMENTS: The accuracy of a model based on signs and symptoms to predict influenza virus infection in patients with upper respiratory tract symptoms, and the ability of a natural language processing technique to identify definitional clinical features from free-text encounter notes.


RESULTS: Surveillance based on the whole encounter note was superior to surveillance based on the chief complaint field alone. For the case definition used by whole-encounter-note surveillance, the normalized partial area under the receiver-operating characteristic curve (specificity, 0.1 to 0.4) was 92.9% with the whole encounter note versus 70.3% with the chief complaint field (difference, 22.6%; P < 0.001). At a fixed specificity of 0.4, the 2 models had sensitivities of 89.0% and 74.4%, respectively (P < 0.001). The relative risk for missing a true case of influenza was 2.3 with the chief complaint field model.
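
The reported figures can be cross-checked with simple arithmetic. The sketch below assumes, which the abstract does not state explicitly, that the relative risk of missing a true case is the ratio of the two miss rates (1 minus sensitivity) at the fixed specificity of 0.4. The function and variable names are illustrative only and do not come from the study.

```python
def miss_rate(sensitivity: float) -> float:
    """Fraction of true cases a model fails to flag (false-negative rate)."""
    return 1.0 - sensitivity


def relative_risk_of_missing(sens_reference: float, sens_comparison: float) -> float:
    """Ratio of miss rates: comparison model relative to reference model.

    Assumption: relative risk is taken as a simple ratio of false-negative
    rates; the abstract does not spell out how it was computed.
    """
    return miss_rate(sens_comparison) / miss_rate(sens_reference)


# Sensitivities reported in the abstract at a fixed specificity of 0.4.
sens_whole_note = 0.890
sens_chief_complaint = 0.744

rr = relative_risk_of_missing(sens_whole_note, sens_chief_complaint)
print(f"Relative risk of missing a case: {rr:.1f}")

# Difference in normalized partial AUC, in percentage points.
pauc_difference = 92.9 - 70.3
print(f"Partial AUC difference: {pauc_difference:.1f}")
```

Under that assumption, the miss rates are 11.0% for the whole encounter note model and 25.6% for the chief complaint model, and their ratio rounds to the reported 2.3.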


LIMITATIONS: Participants were seen at 1 tertiary referral center. The cost of comprehensive biosurveillance monitoring was not studied.


CONCLUSION: A biosurveillance model for influenza using the whole encounter note is more accurate than a model that uses only the chief complaint field. Because case-defining signs and symptoms of influenza are commonly available in health records, the investigators believe that the national strategy for biosurveillance should be changed to incorporate data from the whole health record.


PRIMARY FUNDING SOURCE: Centers for Disease Control and Prevention.

[PubMed - indexed for MEDLINE]