Bioinformatics. 2015 Mar 1;31(5):745-52. doi: 10.1093/bioinformatics/btu715. Epub 2014 Oct 29.

Measuring the wisdom of the crowds in network-based gene function inference.

Stanley Institute for Cognitive Genomics, Cold Spring Harbor Laboratory, Woodbury, NY 11797, USA.



Network-based gene function inference methods have proliferated in recent years, but measurable progress remains elusive. We wished to better explore performance trends by controlling data and algorithm implementation, with a particular focus on the performance of aggregate predictions.


Hypothesizing that popular methods would perform well without hand-tuning, we used well-characterized algorithms to produce verifiably 'untweaked' results. We find that most state-of-the-art machine learning methods obtain 'gold standard' performance as measured in critical assessments in defined tasks. Across a broad range of tests, we see close alignment in algorithm performances after controlling for the underlying data being used. We find that algorithm aggregation provides only modest benefits, with a 17% increase in area under the ROC curve (AUROC) above the mean AUROC. In contrast, data aggregation gains are enormous, with an 88% improvement in mean AUROC. Altogether, we find substantial evidence to support the view that additional algorithm development has little to offer for gene function prediction.
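To make the comparison between aggregate and mean performance concrete, the following is a minimal sketch of how an aggregated prediction could be scored against the mean AUROC of its component predictions. It is not the authors' implementation: the rank-averaging scheme, the synthetic scores, and the names `average_ranks`, `auroc`, `aggregate_by_rank` and `demo` are all assumptions made for illustration, and the abstract does not specify how its percentage gains were computed.

```python
import numpy as np


def average_ranks(scores):
    """Return 1-based ranks; tied scores share their average rank."""
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(scores, kind="mergesort")
    sorted_scores = scores[order]
    ranks = np.empty(len(scores), dtype=float)
    i = 0
    while i < len(scores):
        j = i
        # Extend j to the end of the current run of tied scores.
        while j + 1 < len(scores) and sorted_scores[j + 1] == sorted_scores[i]:
            j += 1
        ranks[order[i:j + 1]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return ranks


def auroc(labels, scores):
    """AUROC from the rank-sum (Mann-Whitney U) identity."""
    labels = np.asarray(labels, dtype=bool)
    n_pos = int(labels.sum())
    n_neg = len(labels) - n_pos
    ranks = average_ranks(scores)
    u_stat = ranks[labels].sum() - n_pos * (n_pos + 1) / 2.0
    return u_stat / (n_pos * n_neg)


def aggregate_by_rank(score_lists):
    """Hypothetical aggregation: average each gene's rank across methods."""
    return np.mean([average_ranks(s) for s in score_lists], axis=0)


def demo(seed=0, n_genes=500, n_methods=4):
    """Compare mean individual AUROC with the aggregate on synthetic data."""
    rng = np.random.default_rng(seed)
    labels = rng.random(n_genes) < 0.1  # synthetic gene-function labels
    # Each synthetic "method" sees the true signal plus independent noise.
    scores = [labels + rng.normal(0.0, 2.0, n_genes) for _ in range(n_methods)]
    individual = [auroc(labels, s) for s in scores]
    combined = auroc(labels, aggregate_by_rank(scores))
    return float(np.mean(individual)), float(combined)


if __name__ == "__main__":
    mean_auroc, aggregate_auroc = demo()
    print(f"mean individual AUROC: {mean_auroc:.3f}")
    print(f"aggregate AUROC:       {aggregate_auroc:.3f}")
```

Rank averaging is used here only because it needs no tuning and does not depend on each method's score scale. Other aggregation schemes would slot into `aggregate_by_rank` without changing the evaluation.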


The supplementary information contains a description of the algorithms, the network data parsed from different biological data resources and a guide to the source code (available at:

[Indexed for MEDLINE]
