BMC Bioinformatics. 2006 Nov 24;7 Suppl 3:S2.

Lexical adaptation of link grammar to the biomedical sublanguage: a comparative evaluation of three approaches.

Author information

  • Turku Centre for Computer Science (TUCS) and University of Turku, Lemminkäisenkatu 14 A, 20520 Turku, Finland. sampo.pyysalo@it.utu.fi

Abstract

BACKGROUND:

We study the adaptation of the Link Grammar Parser to the biomedical sublanguage, with a focus on domain terms not found in a general parser lexicon. Using two biomedical corpora, we implement and evaluate three approaches to addressing unknown words: automatic lexicon expansion, the use of morphological clues, and disambiguation using a part-of-speech tagger. We evaluate each approach separately for its effect on parsing performance and consider combinations of these approaches.
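To make the "morphological clues" idea concrete, the following is a minimal sketch of suffix-based word-class guessing for words missing from a lexicon. It is an illustration only: the suffix table, the class labels, and the function name `guess_word_class` are assumptions made for this example and are not taken from the paper or from the Link Grammar Parser itself.

```python
# Hypothetical suffix rules for illustration; the paper's actual rules
# are not given in the abstract.
SUFFIX_CLASSES = [
    ("ylation", "noun"),    # e.g. "phosphorylation"
    ("ated", "verb"),       # e.g. "phosphorylated"
    ("ase", "noun"),        # e.g. enzyme names such as "kinase"
    ("ic", "adjective"),    # e.g. "cytoplasmic"
    ("ly", "adverb"),       # e.g. "transiently"
]


def guess_word_class(word, lexicon):
    """Return the lexicon class if the word is known, else guess from its suffix."""
    lower = word.lower()
    if lower in lexicon:
        return lexicon[lower]
    # Try longer suffixes first so the most specific rule wins.
    for suffix, word_class in sorted(SUFFIX_CLASSES, key=lambda rule: -len(rule[0])):
        if lower.endswith(suffix):
            return word_class
    return "unknown"


if __name__ == "__main__":
    general_lexicon = {"the": "determiner", "binds": "verb"}
    for token in ["The", "kinase", "phosphorylated", "cytoplasmic", "xyz"]:
        print(token, guess_word_class(token, general_lexicon))
```

A rule table like this trades precision for coverage: it needs no domain resources, which is the situation the abstract describes for cases where no domain part-of-speech tagger is available.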

RESULTS:

In addition to a 45% increase in parsing efficiency, we find that the best approach, incorporating information from a domain part-of-speech tagger, offers a statistically significant 10% relative decrease in error.
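A relative decrease in error is measured against the baseline error rate, not in absolute percentage points. The short sketch below shows the arithmetic; the error rates used are hypothetical values chosen for illustration and are not figures reported in the paper.

```python
def relative_error_reduction(baseline_error, adapted_error):
    """Fraction of the baseline error removed by the adapted system."""
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return (baseline_error - adapted_error) / baseline_error


if __name__ == "__main__":
    # Hypothetical error rates, not taken from the paper.
    baseline = 0.20
    adapted = 0.18
    reduction = relative_error_reduction(baseline, adapted)
    print(f"relative error reduction: {reduction:.0%}")
```

With these assumed numbers, a drop of two percentage points corresponds to a relative reduction of roughly 10%.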

CONCLUSION:

When available, a high-quality domain part-of-speech tagger is the best solution to unknown word issues in the domain adaptation of a general parser. In the absence of such a resource, surface clues can provide remarkably good coverage and performance when tuned to the domain. The adapted parser is available under an open-source license.

PMID: 17134475 [PubMed - indexed for MEDLINE]
PMCID: PMC1764446 (free PMC article)
