    J Biomed Inform. 2011 Dec;44(6):927-35. doi: 10.1016/j.jbi.2011.06.001. Epub 2011 Jun 12.

    Dynamic categorization of clinical research eligibility criteria by hierarchical clustering.

    Source

    Department of Biomedical Informatics, Columbia University, New York, NY 10032, United States.

    Abstract

    OBJECTIVE:

    To semi-automatically induce semantic categories of eligibility criteria from text and to automatically classify eligibility criteria based on their semantic similarity.

    DESIGN:

    The UMLS semantic types and a set of previously developed semantic preference rules were utilized to create an unambiguous semantic feature representation to induce eligibility criteria categories through hierarchical clustering and to train supervised classifiers.
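
    As a rough illustration of this design, the sketch below clusters criteria that are each represented as a set of UMLS semantic types, using average-linkage agglomerative clustering over Jaccard distance. The example criteria, their semantic-type assignments, the distance measure, the linkage method, and the cut-off threshold are all assumptions made for illustration; the abstract does not specify them, and the semantic preference rules are not modeled here.

```python
from itertools import combinations

# Hypothetical criteria, each mapped to an assumed set of UMLS semantic types.
criteria = {
    "history of myocardial infarction": {"Disease or Syndrome"},
    "diagnosis of type 2 diabetes": {"Disease or Syndrome", "Finding"},
    "currently taking warfarin": {"Pharmacologic Substance"},
    "prior treatment with metformin": {
        "Pharmacologic Substance",
        "Therapeutic or Preventive Procedure",
    },
}


def jaccard_distance(a, b):
    """Return 1 - |a & b| / |a | b| for two sets (0.0 if both are empty)."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def average_linkage(c1, c2, features):
    """Mean pairwise Jaccard distance between the members of two clusters."""
    total = sum(jaccard_distance(features[x], features[y]) for x in c1 for y in c2)
    return total / (len(c1) * len(c2))


def agglomerative_clusters(features, threshold):
    """Merge the closest pair of clusters until the closest pair exceeds threshold."""
    clusters = [[name] for name in features]
    while len(clusters) > 1:
        (i, j), dist = min(
            (
                ((i, j), average_linkage(clusters[i], clusters[j], features))
                for i, j in combinations(range(len(clusters)), 2)
            ),
            key=lambda pair: pair[1],
        )
        if dist > threshold:
            break
        # i < j always holds, so deleting index j leaves index i valid.
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


clusters = agglomerative_clusters(criteria, threshold=0.6)
for members in clusters:
    print(members)
```

    In this toy input the two disease-related criteria and the two medication-related criteria each share a semantic type, so they merge into two groups, which a human reviewer could then name as categories (the "semi-automatic" step).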

    MEASUREMENTS:

    We induced 27 categories and measured their prevalence in 27,278 eligibility criteria from 1578 clinical trials. We then compared classification performance (precision, recall, and F1-score) between the UMLS-based feature representation and the "bag of words" feature representation across five common classifiers in Weka: J48, Bayesian Network, Naïve Bayesian, Nearest Neighbor, and an instance-based learning classifier.
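
    For reference, the three performance measures can be computed per category as in the minimal sketch below. The category labels and the gold and predicted label lists are hypothetical and do not come from the study.

```python
def precision_recall_f1(gold, predicted, category):
    """Compute precision, recall, and F1-score for one category."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == category and p == category)
    fp = sum(1 for g, p in pairs if g != category and p == category)
    fn = sum(1 for g, p in pairs if g == category and p != category)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Hypothetical gold-standard and classifier-assigned categories.
gold = ["disease", "disease", "medication", "age", "disease"]
predicted = ["disease", "medication", "medication", "age", "disease"]

p, r, f1 = precision_recall_f1(gold, predicted, "disease")
print(f"precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```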

    RESULTS:

    The UMLS semantic feature representation outperformed the "bag of words" feature representation in 89% of the criteria categories. Using the semantically induced categories, machine-learning classifiers required only 2000 instances to stabilize classification performance. The J48 classifier yielded the best F1-score, and the Bayesian Network classifier achieved the best learning efficiency.

    CONCLUSION:

    The UMLS is an effective knowledge source and can enable an efficient feature representation for semi-automated semantic category induction and automatic categorization for clinical research eligibility criteria and possibly other clinical text.

    Copyright © 2011 Elsevier Inc. All rights reserved.

    PMID: 21689783 [PubMed - indexed for MEDLINE]
    PMCID: PMC3183114 (free PMC article)
