Display Settings:

Format

Send to:

Choose Destination
See comment in PubMed Commons below
Bioinformatics. 2004 Jan 22;20(2):216-25.

GAPSCORE: finding gene and protein names one word at a time.

Author information

  • 1Department of Genetics, Stanford Medical Center, 300 Pasteur Drive, Lane L 301, Mail Code 5120, Stanford, CA 94305-5120, USA.

Abstract

MOTIVATION:

New high-throughput technologies have accelerated the accumulation of knowledge about genes and proteins. However, much knowledge is still stored as written natural language text. Therefore, we have developed a new method, GAPSCORE, to identify gene and protein names in text. GAPSCORE scores words based on a statistical model of gene names that quantifies their appearance, morphology and context.

RESULTS:

We evaluated GAPSCORE against the Yapex data set and achieved an F-score of 82.5% (83.3% recall, 81.5% precision) for partial matches and 57.6% (58.5% recall, 56.7% precision) for exact matches. Since the method is statistical, users can choose score cutoffs that adjust the performance according to their needs.

AVAILABILITY:

GAPSCORE is available at http://bionlp.stanford.edu/gapscore/

PMID:
14734313
[PubMed - indexed for MEDLINE]
Free full text
PubMed Commons home

PubMed Commons

0 comments
How to join PubMed Commons

    Supplemental Content

    Full text links

    Icon for HighWire
    Loading ...
    Write to the Help Desk