Proc AMIA Symp. 1999 : 176–180.
PMCID: PMC2232672

Analysis of biomedical text for chemical names: a comparison of three methods.

Abstract

At the National Library of Medicine (NLM), a variety of biomedical vocabularies are found in data pertinent to its mission. In addition to standard medical terminology, there are specialized vocabularies, including that of chemical nomenclature. Normal language tools, including the lexically based ones used by the Unified Medical Language System (UMLS) to manipulate and normalize text, do not work well on chemical nomenclature. To improve NLM's capabilities in chemical text processing, two approaches to the problem of recognizing chemical nomenclature were explored. The first approach was lexical and consisted of analyzing text for the presence of a fixed set of chemical segments. This approach was extended with general chemical patterns and with terms from NLM's indexing vocabulary, MeSH, and the NLM SPECIALIST lexicon. The second approach applied Bayesian classification to n-grams of text via two different methods. The single lexical method and the two statistical methods were tested against data from the 1999 UMLS Metathesaurus. One of the statistical methods had an overall classification accuracy of 97%.
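The two approaches named in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the segment list, the toy training terms, the choice of character trigrams, the add-one smoothing, and all names (`CHEMICAL_SEGMENTS`, `has_chemical_segment`, `NGramNaiveBayes`) are assumptions made here for illustration, since the abstract gives none of those details. The first function checks a term against a fixed set of chemical segments; the class applies a naive Bayes classifier to character n-grams.

```python
import math
from collections import Counter

# Hypothetical segment list for illustration; the paper's actual set is not given.
CHEMICAL_SEGMENTS = ("methyl", "ethyl", "chlor", "hydroxy", "amino", "benz", "phen", "oxy")


def has_chemical_segment(term):
    """Lexical check: True if the term contains any segment from the fixed set."""
    lowered = term.lower()
    return any(segment in lowered for segment in CHEMICAL_SEGMENTS)


def char_ngrams(term, n=3):
    """Character n-grams of a term, padded with one space on each side."""
    padded = f" {term.lower()} "
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


class NGramNaiveBayes:
    """Naive Bayes over character n-grams with add-one smoothing (an assumed design)."""

    def __init__(self, n=3):
        self.n = n
        self.gram_counts = {}         # label -> Counter of n-grams
        self.gram_totals = {}         # label -> total number of n-grams seen
        self.term_counts = Counter()  # label -> number of training terms
        self.vocabulary = set()

    def fit(self, examples):
        for term, label in examples:
            grams = char_ngrams(term, self.n)
            self.gram_counts.setdefault(label, Counter()).update(grams)
            self.term_counts[label] += 1
            self.vocabulary.update(grams)
        self.gram_totals = {
            label: sum(counts.values()) for label, counts in self.gram_counts.items()
        }
        return self

    def log_score(self, term, label):
        """log P(label) plus the sum of log P(n-gram | label) over the term's n-grams."""
        score = math.log(self.term_counts[label] / sum(self.term_counts.values()))
        counts = self.gram_counts[label]
        denominator = self.gram_totals[label] + len(self.vocabulary)
        for gram in char_ngrams(term, self.n):
            score += math.log((counts[gram] + 1) / denominator)
        return score

    def predict(self, term):
        return max(self.term_counts, key=lambda label: self.log_score(term, label))


# Toy training data, invented for this sketch (not drawn from the UMLS Metathesaurus).
TRAINING = [
    ("2-chloroethanol", "chemical"),
    ("methylbenzene", "chemical"),
    ("hydroxybenzoic acid", "chemical"),
    ("ethylamine", "chemical"),
    ("dichloromethane", "chemical"),
    ("heart failure", "other"),
    ("blood pressure", "other"),
    ("kidney transplant", "other"),
    ("patient history", "other"),
    ("chronic pain", "other"),
]

model = NGramNaiveBayes(n=3).fit(TRAINING)
```

With this toy data, `model.predict("chlorobenzene")` and `has_chemical_segment("chlorobenzene")` both flag the term as chemical. A ten-term training set says nothing about the accuracy reported in the abstract, which was measured against the 1999 UMLS Metathesaurus.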

Full text

Full text is available as a scanned copy of the original print version.

Articles from Proceedings of the AMIA Symposium are provided here courtesy of American Medical Informatics Association
