BMC Bioinformatics. 2006 Nov 24;7 Suppl 3:S3.

An environment for relation mining over richly annotated corpora: the case of GENIA.

Author information

  • Institute of Computational Linguistics, IFI, University of Zurich, Switzerland. rinaldi@ifi.unizh.ch



The biomedical domain is witnessing rapid growth in the volume of published scientific results, which makes it increasingly difficult to filter out the core information. There is a real need for support tools that 'digest' the published results and extract the most important information.


We describe and evaluate an environment supporting the extraction of domain-specific relations, such as protein-protein interactions, from a richly-annotated corpus. We use full, deep-linguistic parsing and manually created, versatile patterns, expressing a large set of syntactic alternations, plus semantic ontology information.
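To make the idea of patterns that cover several syntactic alternations more concrete, the following is a minimal sketch in Python, not the system described in the paper. It assumes a sentence has already been parsed into dependency triples; the `Dep` type, the labels `subj`, `obj`, `subj_pass` and `by_agent`, the `ONTOLOGY` table and the `extract` function are all hypothetical names chosen for illustration. Both an active and a passive construction are mapped onto the same (agent, verb, target) relation, and a toy ontology lookup restricts the arguments to proteins.

```python
from typing import NamedTuple


class Dep(NamedTuple):
    head: str  # governing word (lemma)
    rel: str   # dependency label (illustrative, not a real parser's tag set)
    dep: str   # dependent word


# Hypothetical toy ontology: maps entity mentions to semantic types.
ONTOLOGY = {"IL-2": "protein", "NF-kappaB": "protein", "aspirin": "drug"}

# Each pattern maps one syntactic alternation onto the same two roles.
PATTERNS = [
    {"agent": "subj", "target": "obj"},            # "A activates B"
    {"agent": "by_agent", "target": "subj_pass"},  # "B is activated by A"
]


def extract(deps, verb, arg_type="protein"):
    """Return (agent, verb, target) triples whose arguments have type arg_type."""
    # Collect the arguments attached directly to the verb, keyed by label.
    args = {d.rel: d.dep for d in deps if d.head == verb}
    found = []
    for pattern in PATTERNS:
        agent = args.get(pattern["agent"])
        target = args.get(pattern["target"])
        if agent is None or target is None:
            continue  # this alternation does not match the sentence
        # Semantic filter: both arguments must have the required type.
        if ONTOLOGY.get(agent) == arg_type and ONTOLOGY.get(target) == arg_type:
            found.append((agent, verb, target))
    return found


active = [Dep("activate", "subj", "IL-2"), Dep("activate", "obj", "NF-kappaB")]
passive = [Dep("activate", "subj_pass", "NF-kappaB"),
           Dep("activate", "by_agent", "IL-2")]
drug = [Dep("activate", "subj", "aspirin"), Dep("activate", "obj", "NF-kappaB")]

print(extract(active, "activate"))
print(extract(passive, "activate"))
print(extract(drug, "activate"))
```

In this sketch the active and passive inputs yield the same relation, while the third input is rejected by the ontology check because one argument is not typed as a protein. A real system would work over full parse structures and a much larger pattern set.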


The experiments show that the approach described is capable of delivering high-precision results, while maintaining sufficient levels of recall. The high level of abstraction of the rules used by the system, which are considerably more powerful and versatile than finite-state approaches, allows speedy interactive development and validation.

[PubMed - indexed for MEDLINE]