PLoS One. 2014 Dec 31;9(12):e115892. doi: 10.1371/journal.pone.0115892. eCollection 2014.

Machine learning for biomedical literature triage.

Author information

1. Department of Computer Science and Software Engineering, Concordia University, Montreal, QC, Canada.
2. Centre for Structural and Functional Genomics, Concordia University, Montreal, QC, Canada.
3. Department of Computer Science and Software Engineering, Concordia University, Montreal, QC, Canada; Centre for Structural and Functional Genomics, Concordia University, Montreal, QC, Canada.

Abstract

This paper presents a machine learning system that supports triage, the first task in the manual curation of biological literature. We compare the performance of several classification models by varying dataset sampling factors, feature sets, and three machine learning algorithms (Naive Bayes, Support Vector Machine, and Logistic Model Trees). The results show that the model best suited to the imbalanced datasets of the triage classification task combines domain-relevant features, an under-sampling technique, and the Logistic Model Trees algorithm.
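
The abstract does not say which under-sampling procedure was used, so the following is only a minimal sketch of one common variant: random under-sampling of the majority class down to a chosen ratio of the minority class. The function name `undersample`, the `factor` parameter, and the toy labels are hypothetical and are not taken from the paper.

```python
import random
from collections import Counter


def undersample(docs, labels, majority_label, factor=1.0, seed=0):
    """Randomly drop majority-class items so that at most
    round(factor * minority_count) of them remain.

    Hypothetical helper for illustration; not the paper's procedure.
    """
    rng = random.Random(seed)
    majority = [i for i, y in enumerate(labels) if y == majority_label]
    minority = [i for i, y in enumerate(labels) if y != majority_label]
    # Never ask for more majority items than actually exist.
    keep_n = min(len(majority), int(round(factor * len(minority))))
    kept = sorted(rng.sample(majority, keep_n) + minority)  # keep original order
    return [docs[i] for i in kept], [labels[i] for i in kept]


# Toy data (assumed): 2 relevant documents among 10.
docs = [f"doc{i}" for i in range(10)]
labels = ["relevant" if i in (3, 7) else "irrelevant" for i in range(10)]

sampled_docs, sampled_labels = undersample(docs, labels, "irrelevant", factor=1.0)
print(Counter(sampled_labels))
```

With `factor=1.0` the two classes end up the same size; a larger factor keeps more of the majority class, which is one way to read the "sampling factors" that the abstract says were varied.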

PMID: 25551575
PMCID: PMC4281078
DOI: 10.1371/journal.pone.0115892
[Indexed for MEDLINE]
Free PMC Article
