    BMC Bioinformatics. 2009 Dec 3;10 Suppl 15:S7.

    BIOADI: a machine learning approach to identifying abbreviations and definitions in biological literature.

    Source

    Institute of Information Science, Academia Sinica, Taipei 115, Taiwan, Republic of China. clarkkuo@iis.sinica.edu.tw

    Abstract

    BACKGROUND:

    Text mining tools are becoming essential for automatically processing large quantities of biological literature for knowledge discovery and information curation. Abbreviation recognition is related to named entity recognition (NER) and can be treated as the task of recognizing pairs consisting of a term and its corresponding abbreviation in free text. Successfully identifying an abbreviation and its corresponding definition is not only a prerequisite for indexing the terms of text databases so that articles of related interest can be retrieved, but also a building block for improving existing gene mention tagging and gene normalization tools.
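To make the pair recognition task concrete, the following minimal sketch extracts (abbreviation, definition) pairs with a naive initial-letter heuristic. It is not the machine-learning method described in this abstract; the function name `find_initialism_pairs` and the matching rule are illustrative assumptions, and the rule only handles abbreviations formed from the first letter of each preceding word.

```python
import re

def find_initialism_pairs(text):
    """Return (short_form, long_form) pairs found by a naive heuristic.

    Assumption (not from the paper): a short form appears in parentheses
    and is built from the initial letters of the words just before it.
    """
    pairs = []
    for match in re.finditer(r"\(([A-Za-z]{2,10})\)", text):
        short_form = match.group(1)
        # All alphabetic words that precede the opening parenthesis.
        words_before = re.findall(r"[A-Za-z]+", text[:match.start()])
        # Take as many preceding words as the short form has letters.
        candidate = words_before[-len(short_form):]
        if len(candidate) < len(short_form):
            continue
        initials = "".join(word[0] for word in candidate)
        if initials.lower() == short_form.lower():
            pairs.append((short_form, " ".join(candidate)))
    return pairs

sentence = ("Abbreviation recognition (AR) is related to "
            "named entity recognition (NER).")
print(find_initialism_pairs(sentence))
```

A rule like this misses abbreviations that skip words or use internal letters, which is one reason a learned model with richer features may be preferable.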

    RESULTS:

    Our approach to abbreviation recognition (AR) is based on machine learning and exploits a novel set of rich features to learn rules from training data. Tested on the AB3P corpus, our system achieved an F-score of 89.90% with 95.86% precision at 84.64% recall, higher than the result of the best-performing existing AR system. We also annotated a new corpus of 1200 PubMed abstracts derived from the BioCreative II gene normalization corpus. On this annotated corpus, our system achieved an F-score of 86.20% with 93.52% precision at 79.95% recall, which also outperforms all tested systems.
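As a quick consistency check on the reported figures, the sketch below recomputes the F-score from precision and recall. It assumes the F-score is the balanced F1 measure (the harmonic mean of precision and recall); the abstract does not state the formula, and the helper name `f_score` is illustrative.

```python
def f_score(precision, recall):
    """Balanced F1: harmonic mean of precision and recall (assumed formula)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Precision and recall (in percent) as reported in the abstract.
ab3p_f = f_score(95.86, 84.64)
new_corpus_f = f_score(93.52, 79.95)

print(f"AB3P corpus: {ab3p_f:.2f}")        # AB3P corpus: 89.90
print(f"New corpus:  {new_corpus_f:.2f}")  # New corpus:  86.20
```

Under this assumption, both recomputed values match the reported F-scores to two decimal places.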

    CONCLUSION:

    By applying our system to extract short form/long form pairs from all available PubMed abstracts, we have constructed BIOADI. Mining BIOADI reveals many interesting trends in biomedical research. We also provide offline AR software in the download section at http://bioagent.iis.sinica.edu.tw/BIOADI/.

    PMID:
    19958517
    [PubMed - indexed for MEDLINE]
    PMCID: PMC2788358
    Free PMC Article
