J Integr Bioinform. 2011 Oct 10;8(2):184. doi: 10.2390/biecoll-jib-2011-184.

Automatic extraction of microorganisms and their habitats from free text using text mining workflows.

Author information

National Centre for Text Mining, University of Manchester, 131 Princess Street, Manchester M1 7DN, UK.


In this paper we illustrate the use of text mining workflows to automatically extract instances of microorganisms and their habitats from free text; these entries can then be curated and added to different databases. To this end, we use a Conditional Random Field (CRF) based classifier, as part of the workflows, to extract mentions of microorganisms, mentions of habitats, and the relations between organisms and their habitats. Results indicate good performance for microorganism recognition and for relation extraction (precision of over 80%), while habitat recognition is only moderate (precision of about 65%). We also conjecture that PDF-to-text conversion can be quite noisy, and that this noise affects any sentence-based relation extraction algorithm.
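
To make the last point concrete, the following is a minimal sketch of why conversion noise can hurt sentence-based relation extraction. It is not the paper's method: the CRF taggers are replaced by hypothetical toy lexicons (`ORGANISMS`, `HABITATS`), the sentence splitter is a naive regular expression, and a relation is assumed whenever an organism and a habitat co-occur in one sentence. The example text and the injected page footer are invented for illustration.

```python
import re

# Hypothetical toy lexicons standing in for the CRF-based taggers.
ORGANISMS = {"Escherichia coli", "Bacillus subtilis"}
HABITATS = {"soil", "human gut"}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text)
    return [p.strip() for p in parts if p.strip()]


def cooccurring_pairs(text):
    """Return (organism, habitat) pairs mentioned within the same sentence."""
    pairs = set()
    for sentence in split_sentences(text):
        orgs = [o for o in ORGANISMS if o in sentence]
        habs = [h for h in HABITATS if h in sentence]
        pairs.update((o, h) for o in orgs for h in habs)
    return pairs


# The same statement, once as clean text and once with a page footer
# that a PDF-to-text converter has dropped into the middle of it.
clean = "Bacillus subtilis is commonly isolated from soil."
noisy = "Bacillus subtilis is commonly isolated\nPage 3 of 12.\nfrom soil."

print(cooccurring_pairs(clean))
print(cooccurring_pairs(noisy))
```

In the noisy version the footer's full stop ends the sentence early, so the organism and the habitat land in different sentences and the pair is lost, even though both mentions are still recognised individually.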

[Indexed for MEDLINE]