Comp Funct Genomics. 2005;6(1-2):61-6. doi: 10.1002/cfg.451.

Towards a semantic lexicon for biological language processing.

Author information

  • Los Alamos National Laboratory, Los Alamos, NM 87545, USA.


This paper explores the use of the resources in the National Library of Medicine's Unified Medical Language System (UMLS) for the construction of a lexicon useful for processing texts in the field of molecular biology. A lexicon is constructed from overlapping terms in the UMLS SPECIALIST lexicon and the UMLS Metathesaurus to obtain both morphosyntactic and semantic information for terms, and the coverage of a domain corpus is assessed. Over 77% of tokens in the domain corpus are found in the constructed lexicon, validating the lexicon's coverage of the most frequent terms in the domain and indicating that the constructed lexicon is potentially an important resource for biological text processing.
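The two steps described in the abstract, intersecting the two resources and measuring token coverage, can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the function names `build_lexicon` and `token_coverage`, the dictionary layout, the lowercase whitespace tokenization, and the toy entries are all assumptions made for the example. Real UMLS data would have to be loaded from the licensed distribution files.

```python
from collections import Counter


def build_lexicon(specialist, metathesaurus):
    """Keep only terms present in both resources and merge their annotations.

    specialist:    term -> morphosyntactic info (hypothetical layout)
    metathesaurus: term -> list of semantic types (hypothetical layout)
    """
    shared = specialist.keys() & metathesaurus.keys()
    return {
        term: {"pos": specialist[term], "semantic_types": metathesaurus[term]}
        for term in shared
    }


def token_coverage(tokens, lexicon):
    """Return the fraction of corpus tokens (not distinct types) found in the lexicon."""
    counts = Counter(token.lower() for token in tokens)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    covered = sum(n for token, n in counts.items() if token in lexicon)
    return covered / total


# Toy stand-ins for the two resources; not real UMLS records.
specialist = {"protein": "noun", "bind": "verb", "kinase": "noun", "the": "det"}
metathesaurus = {
    "protein": ["Amino Acid, Peptide, or Protein"],
    "kinase": ["Enzyme"],
    "receptor": ["Receptor"],
}

lexicon = build_lexicon(specialist, metathesaurus)
tokens = "the kinase binds the protein receptor".split()
coverage = token_coverage(tokens, lexicon)
print(f"{len(lexicon)} shared terms, token coverage {coverage:.1%}")
```

Coverage here is counted over tokens rather than distinct word types, which matches the abstract's wording ("over 77% of tokens"). A type-level measure would weight rare and frequent terms equally and would generally give a different figure.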

Free PMC Article