Database (Oxford). 2017 Jan 1;2017. doi: 10.1093/database/bax030.

Improving biocuration of microRNAs in diseases: a case study in idiopathic pulmonary fibrosis.

Author information

Facultad de Ciencias, Departamento Biología Celular, Universidad Nacional Autónoma de México, Ciudad Universitaria, Circuito Exterior s/n, Coyoacán, CP 04510, Ciudad de México, CDMX, México.
CONACYT-INER Ismael Cosío Villegas, Departamento Investigación, Calzada de Tlalpan 4502 Sección XVI, Tlalpan, CP Ciudad de México, CDMX, México.
Swiss Institute of Bioinformatics and Institute of Computational Linguistics, University of Zurich, Andreasstrasse 15, CH-8050 Zurich, Switzerland.
Center for Genomics Sciences, Computational Genomics Program, Universidad Nacional Autónoma de México, Av. Universidad s/n, Chamilpa, CP 62210, Cuernavaca, Morelos, México.
Instituto Nacional de Enfermedades Respiratorias Ismael Cosío Villegas, Dirección de Investigación Calzada de Tlalpan 4502 Sección XVI, Tlalpan, CP Ciudad de México, CDMX, México.


MicroRNAs (miRNAs) are small non-coding RNA molecules that inhibit gene expression post-transcriptionally. They play important roles in several biological processes, and in recent years there has been growing interest in how they relate to the pathogenesis of diseases. Although some databases already contain information on miRNAs and their relation to illnesses, curating them is a significant challenge because of the amount of information generated every day. In particular, respiratory diseases are poorly documented in databases, even though they are of increasing concern in terms of morbidity, mortality and economic impact. In this work, we present the results we obtained in the BioCreative Interactive Track (IAT), using a semiautomatic approach to improve the biocuration of miRNAs related to diseases. Our procedures will be useful for complementing databases that contain this type of information. We adapted the OntoGene text mining pipeline and the ODIN curation system to a full-text corpus of scientific publications concerning one specific respiratory disease: idiopathic pulmonary fibrosis, the most common and aggressive of the idiopathic interstitial pneumonias. Using this semiautomatic approach with the OntoGene/ODIN system, we curated 823 miRNA text snippets and found a total of 246 miRNAs related to the disease. The biocuration throughput improved by a factor of 12 compared with traditional manual biocuration. A significant advantage of our semiautomatic pipeline is that it can be applied to obtain the miRNAs of all respiratory diseases and could also be used for other illnesses.

Database URL:

[Indexed for MEDLINE]
Free PMC Article
