J Am Med Inform Assoc. 2010 Jul-Aug;17(4):375-82. doi: 10.1136/jamia.2009.001412.

Evaluation of a generalizable approach to clinical information retrieval using the automated retrieval console (ARC).

Author information

Massachusetts Veterans Epidemiology Research and Information Center Cooperative Studies Coordinating Center, VA Boston Healthcare System, Jamaica Plain, Massachusetts 02130, USA. leonard.davolio@va.gov

Abstract

Reducing custom software development effort is an important goal in information retrieval (IR). This study evaluated a generalizable approach involving no custom software or rules development. The study used documents "consistent with cancer" to evaluate system performance in the domains of colorectal (CRC), prostate (PC), and lung (LC) cancer. Using an end-user-supplied reference set, the automated retrieval console (ARC) iteratively calculated the performance of combinations of natural language processing-derived features and supervised classification algorithms. Training and testing involved 10-fold cross-validation for three sets of 500 documents each. Performance metrics included recall, precision, and F-measure. Annotation time for five physicians was also measured. Top-performing algorithms had recall, precision, and F-measure values as follows: for CRC, 0.90, 0.92, and 0.89, respectively; for PC, 0.97, 0.95, and 0.94; and for LC, 0.76, 0.80, and 0.75. In all but one case, conditional random fields outperformed maximum entropy-based classifiers. Algorithms had good performance without custom code or rules development, but performance varied by specific application.
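As an illustrative aside, the evaluation design described in the abstract (10-fold cross-validation over labeled document sets, scored by recall, precision, and F-measure) can be sketched in a few lines of Python. This is a minimal, hypothetical stand-in and not the ARC system itself: the TF-IDF features and logistic regression classifier below substitute for ARC's NLP-derived features and its conditional random field / maximum entropy classifiers, which the paper actually compares.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_validate
from sklearn.pipeline import make_pipeline

def evaluate(documents, labels):
    """documents: list of document texts;
    labels: 0/1 per document, e.g. 'consistent with cancer' or not.
    Returns mean 10-fold recall, precision, and F-measure,
    where F1 = 2 * P * R / (P + R)."""
    # Stand-in pipeline: TF-IDF features feeding a linear classifier.
    pipeline = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
    scores = cross_validate(
        pipeline, documents, labels,
        cv=10,
        scoring=("recall", "precision", "f1"),
    )
    return {m: scores["test_" + m].mean() for m in ("recall", "precision", "f1")}

For a reference set like those in the study (500 documents per domain with end-user-supplied labels), evaluate(docs, labels) would return per-fold mean recall, precision, and F-measure, analogous to the figures reported above.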

PMID: 20595303
PMCID: PMC2995644
DOI: 10.1136/jamia.2009.001412
[Indexed for MEDLINE]
Free PMC Article