J Biomed Inform. 2012 Dec;45(6):1075-83. doi: 10.1016/j.jbi.2012.06.003. Epub 2012 Jun 25.

A new clustering method for detecting rare senses of abbreviations in clinical notes.

Author information

  • Department of Biomedical Informatics, Vanderbilt University, Nashville, TN 37203, USA. hua.xu@vanderbilt.edu

Abstract

Abbreviations are widely used in clinical documents, and they are often ambiguous. Building a list of possible senses (also called a sense inventory) for each ambiguous abbreviation is the first step toward automatically identifying the correct meaning of an abbreviation in a given context. Clustering-based methods have been used to detect senses of abbreviations from a clinical corpus [1]. However, rare senses remain challenging, and existing algorithms often fail to detect them. In this study, we developed a new two-phase clustering algorithm called Tight Clustering for Rare Senses (TCRS) and applied it to sense generation for abbreviations in clinical text. Using manually annotated sense inventories for a set of 13 ambiguous clinical abbreviations, we evaluated TCRS against the existing Expectation Maximization (EM) clustering algorithm for sense generation at two levels of annotation cost (10 vs. 20 instances per abbreviation). With similar annotation effort (about 20 instances), the TCRS-based method detected 85% of senses on average, whereas the EM-based method found only 75%. Further analysis showed that the improvement from TCRS came mainly from additionally detected rare senses, indicating its usefulness for building more complete sense inventories of clinical abbreviations.
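
The evaluation measure described above (the share of an abbreviation's known senses that are recovered under a fixed annotation budget) can be illustrated with a minimal sketch. This is not the TCRS or EM implementation, and the abstract does not specify how instances were chosen for annotation. The function name `sense_coverage`, the rule of annotating one instance per cluster starting with the largest clusters, and the toy data are all assumptions made for illustration only.

```python
from typing import Dict, Hashable, List, Sequence


def sense_coverage(
    cluster_ids: Sequence[Hashable],
    gold_senses: Sequence[str],
    budget: int,
) -> float:
    """Fraction of distinct gold senses found within an annotation budget.

    Assumption (not from the paper): one instance is annotated per cluster,
    visiting clusters from largest to smallest, until `budget` is used up.
    """
    if len(cluster_ids) != len(gold_senses):
        raise ValueError("cluster_ids and gold_senses must have equal length")

    # Group instance indices by the cluster they were assigned to.
    clusters: Dict[Hashable, List[int]] = {}
    for idx, cid in enumerate(cluster_ids):
        clusters.setdefault(cid, []).append(idx)

    # Largest clusters first; sorted() is stable, so ties keep input order.
    ordered = sorted(clusters.values(), key=len, reverse=True)

    # "Annotate" the first member of each selected cluster.
    found = {gold_senses[members[0]] for members in ordered[:budget]}
    return len(found) / len(set(gold_senses))


# Toy data: six instances of one abbreviation, three gold senses,
# the third of which is rare (a single instance).
toy_clusters = [0, 0, 0, 1, 1, 2]
toy_senses = ["sense_a", "sense_a", "sense_a", "sense_b", "sense_b", "sense_c"]

coverage_small = sense_coverage(toy_clusters, toy_senses, budget=2)
coverage_large = sense_coverage(toy_clusters, toy_senses, budget=3)
```

In this toy case the rare sense is recovered only when the budget is large enough to reach its small cluster, which is the kind of effect the comparison at 10 vs. 20 annotated instances is meant to expose.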

Copyright © 2012 Elsevier Inc. All rights reserved.

[PubMed - indexed for MEDLINE]