J Integr Bioinform. 2011 Oct 10;8(2):184. doi: 10.2390/biecoll-jib-2011-184.

Automatic extraction of microorganisms and their habitats from free text using text mining workflows.

Author information

National Centre for Text Mining, University of Manchester, 131 Princess Street, Manchester M1 7DN, UK. kollurub@cs.man.ac.uk

Abstract

In this paper we illustrate the use of text mining workflows to automatically extract instances of microorganisms and their habitats from free text; these entries can then be curated and added to different databases. To this end, we use a Conditional Random Field (CRF) based classifier, as part of the workflows, to extract mentions of microorganisms and habitats and the relations between organisms and their habitats. Results indicate good performance for the extraction of microorganisms and for the relation extraction aspect of the task (with a precision of over 80%), while habitat recognition is only moderate (a precision of about 65%). We also conjecture that PDF-to-text conversion can be quite noisy and that this implicitly affects any sentence-based relation extraction algorithm.
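
To make the final conjecture concrete, the following is a minimal sketch of how conversion noise can break a sentence-based relation extractor. It is not the method used in the paper: the toy lexicons, the regex sentence splitter, and the function names (`split_sentences`, `candidate_pairs`) are all assumptions made for illustration, standing in for the CRF-based classifier the abstract describes.

```python
import re
from itertools import product

# Hypothetical toy lexicons (assumption); the paper uses a CRF-based tagger.
ORGANISMS = {"Escherichia coli", "Bacillus subtilis"}
HABITATS = {"soil", "human gut"}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text)
    return [p.strip() for p in parts if p.strip()]


def candidate_pairs(text):
    """Return (organism, habitat) pairs that co-occur within one sentence."""
    pairs = []
    for sentence in split_sentences(text):
        orgs = sorted(o for o in ORGANISMS if o in sentence)
        habs = sorted(h for h in HABITATS if h in sentence)
        pairs.extend(product(orgs, habs))
    return pairs


clean = "Bacillus subtilis is commonly isolated from soil."
# Simulated conversion noise: a page footer lands in the middle of the sentence,
# so the splitter sees a sentence boundary between the organism and the habitat.
noisy = "Bacillus subtilis is commonly isolated from\n\nPage 3.\n\nsoil."

print(candidate_pairs(clean))
print(candidate_pairs(noisy))
```

In this sketch the clean sentence yields one organism-habitat candidate, while the noisy version yields none, because the stray footer text separates the two mentions into different "sentences". A real pipeline would face the same failure mode regardless of how the mentions themselves are recognised.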

PMID: 21987583
DOI: 10.2390/biecoll-jib-2011-184
[Indexed for MEDLINE]