PLoS One. 2015 Aug 19;10(8):e0134668. doi: 10.1371/journal.pone.0134668. eCollection 2015.

Gene Function Prediction from Functional Association Networks Using Kernel Partial Least Squares Regression.

Author information

1. CoMPLEX, University College London, London, United Kingdom; Institute of Structural and Molecular Biology, University College London, London, United Kingdom.
2. Institute of Structural and Molecular Biology, University College London, London, United Kingdom.
3. Department of Genetics, Evolution and Environment, University College London, London, United Kingdom.
4. Department of Computer Science, University College London, London, United Kingdom.

Abstract

With the growing availability of large-scale biological datasets, automated methods of extracting functionally meaningful information from these data are becoming increasingly important. Data on functional association between genes or proteins, such as co-expression, are often represented as gene or protein networks. Several methods of predicting gene function from these networks have been proposed. However, evaluating the relative performance of these algorithms may not be trivial: concerns have been raised over biases in different benchmarking methods and datasets, particularly relating to non-independence of functional association data and test data. In this paper we propose a new network-based gene function prediction algorithm using a commute-time kernel and partial least squares regression (Compass). We compare Compass to GeneMANIA, a leading network-based prediction algorithm, using a number of different benchmarks, and find that Compass outperforms GeneMANIA on these benchmarks. We also explicitly explore problems associated with the non-independence of functional association data and test data. We find that a benchmark based on the Gene Ontology database, which directly or indirectly incorporates information from other databases, may considerably overestimate the performance of algorithms that exploit functional association data for prediction.
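
To make the two ingredients named in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes the commute-time kernel is taken as the pseudoinverse of the graph Laplacian, and it uses a single-label kernel partial least squares fit in the NIPALS style. The function names, the toy eight-gene network, and the choice of one latent component are all illustrative assumptions rather than details from the paper.

```python
import numpy as np

def commute_time_kernel(adjacency):
    """Commute-time kernel, taken here as the pseudoinverse of L = D - A."""
    A = np.asarray(adjacency, dtype=float)
    laplacian = np.diag(A.sum(axis=1)) - A
    return np.linalg.pinv(laplacian)

def kernel_pls_predict(K, train_idx, test_idx, y_train, n_components=1):
    """Single-label kernel PLS: fit on labelled genes, score unlabelled ones."""
    y = np.asarray(y_train, dtype=float)
    n, n_test = len(train_idx), len(test_idx)
    K_tr = K[np.ix_(train_idx, train_idx)]
    K_te = K[np.ix_(test_idx, train_idx)]

    # Centre the kernels and the labels using training statistics only.
    H = np.eye(n) - np.ones((n, n)) / n
    K_c = H @ K_tr @ H
    K_te_c = (K_te - np.ones((n_test, n)) @ K_tr / n) @ H
    y_mean = y.mean()
    y_c = y - y_mean

    K_d, y_d = K_c.copy(), y_c.copy()
    T, U = [], []
    for _ in range(n_components):
        u = y_d                      # with one label, u is the deflated label vector
        t = K_d @ u
        norm = np.linalg.norm(t)
        if norm < 1e-12:             # nothing left to explain
            break
        t = t / norm
        T.append(t)
        U.append(u)
        # Deflate the kernel and the labels by the extracted score vector.
        P = np.eye(n) - np.outer(t, t)
        K_d = P @ K_d @ P
        y_d = y_d - t * (t @ y_d)

    T, U = np.column_stack(T), np.column_stack(U)
    coef = U @ np.linalg.solve(T.T @ K_c @ U, T.T @ y_c)
    return K_te_c @ coef + y_mean

# Toy network (hypothetical): two 4-gene cliques joined by one edge (3-4).
A = np.zeros((8, 8))
for group in (range(0, 4), range(4, 8)):
    for i in group:
        for j in group:
            if i != j:
                A[i, j] = 1.0
A[3, 4] = A[4, 3] = 1.0

K = commute_time_kernel(A)
# Genes 0 and 1 carry the function, genes 6 and 7 do not; score genes 2 and 5.
scores = kernel_pls_predict(K, [0, 1, 6, 7], [2, 5], [1, 1, 0, 0])
print(scores)
```

In this toy setting the unlabelled gene that shares a clique with the annotated genes receives the higher score, which is the behaviour a network-based predictor is meant to show. A real benchmark would also need the separation between association data and test annotations that the abstract warns about.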

PMID: 26288239
PMCID: PMC4545790
DOI: 10.1371/journal.pone.0134668
[Indexed for MEDLINE]
Free PMC Article
