AMIA Annu Symp Proc. 2008 Nov 6:820-4.

Unsupervised method for automatic construction of a disease dictionary from a large free text collection.

Author information

  • Center for Biomedical Informatics Research, Stanford University, CA, USA. xurong@stanford.edu

Abstract

Concept-specific lexicons (e.g., diseases, drugs, anatomy) are a critical source of background knowledge for many medical language-processing systems. However, the rapid pace of biomedical research and the lack of constraints on usage ensure that such dictionaries are incomplete. Focusing on disease terminology, we have developed an automated, unsupervised, iterative pattern-learning approach for constructing a comprehensive medical dictionary of disease terms from randomized clinical trial (RCT) abstracts, and we compared different ranking methods for automatically extracting contextual patterns and concept terms. When used to identify disease concepts in 100 randomly chosen, manually annotated clinical abstracts, our disease dictionary shows a significant performance improvement (F1 increased by 35-88%) over available, manually created disease terminologies.
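
The iterative pattern-learning idea described in the abstract can be illustrated with a toy bootstrapping loop. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `learn_dictionary`, the two-word left-context pattern, the restriction to single-word terms, and the ranking of patterns by the number of distinct known terms they precede are all hypothetical choices made for illustration. The abstract states only that several ranking methods were compared.

```python
from collections import defaultdict


def learn_dictionary(sentences, seeds, iterations=2, top_patterns=2):
    """Toy bootstrapping loop (illustrative only).

    Alternates between learning left-context patterns from known terms
    and harvesting new single-word terms that follow those patterns.
    """
    terms = set(seeds)
    for _ in range(iterations):
        # Step 1: record the two words that precede each known term.
        support = defaultdict(set)  # pattern -> known terms it precedes
        for sent in sentences:
            words = sent.lower().split()
            for i in range(2, len(words)):
                if words[i] in terms:
                    support[(words[i - 2], words[i - 1])].add(words[i])

        # Step 2 (assumed ranking): prefer patterns that precede the most
        # distinct known terms; ties are broken alphabetically.
        ranked = sorted(support, key=lambda p: (-len(support[p]), p))
        patterns = set(ranked[:top_patterns])

        # Step 3: every word that follows a selected pattern joins the
        # dictionary and can support new patterns in the next iteration.
        for sent in sentences:
            words = sent.lower().split()
            for i in range(2, len(words)):
                if (words[i - 2], words[i - 1]) in patterns:
                    terms.add(words[i])
    return terms


if __name__ == "__main__":
    corpus = [
        "patients diagnosed with asthma were enrolled",
        "adults diagnosed with diabetes received placebo",
        "children diagnosed with epilepsy were randomized",
        "treated with aspirin daily",
    ]
    print(sorted(learn_dictionary(corpus, {"asthma"})))
```

Starting from the single seed "asthma", the loop learns the context "diagnosed with" and uses it to pick up "diabetes" and "epilepsy", while "aspirin" is left out because it never follows a learned pattern. A realistic system would also need multi-word terms, a term-ranking step, and a stopping criterion, none of which are specified in the abstract.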

PMID: 18999169 [PubMed - indexed for MEDLINE]
PMCID: PMC2656087
Free PMC Article