Bioinformatics. 2016 Aug 1;32(15):2281-8. doi: 10.1093/bioinformatics/btw166. Epub 2016 Mar 26.

Genome-scale prediction of moonlighting proteins using diverse protein association information.

Author information

1 Department of Computer Science.
2 Department of Computer Science and Department of Biological Science, Purdue University, West Lafayette, IN, USA.

Abstract

MOTIVATION:

Moonlighting proteins (MPs) perform multiple cellular functions within a single polypeptide chain. To understand the overall landscape of their functional diversity, it is important to establish a computational method that can identify MPs on a genome scale. Previously, we systematically characterized MPs using functional and omics-scale information. In this work, we develop a computational prediction model for automatic identification of MPs using a diverse range of protein association information.

RESULTS:

We incorporated a diverse range of protein association information to extract characteristic features of MPs: gene ontology (GO), protein-protein interactions, gene expression, phylogenetic profiles, genetic interactions, network-based graph properties and protein structural properties, i.e. intrinsically disordered regions in the protein chain. We then applied machine learning classifiers to this broad feature space to predict MPs. Because many known MPs lack some proteomic features, we developed an imputation technique to fill in such missing features. Results on the control dataset show that MPs can be predicted with over 98% accuracy when GO terms are available. Furthermore, using only the omics-based features, the method can still identify MPs with over 75% accuracy. Finally, we applied the method to three genomes, Saccharomyces cerevisiae, Caenorhabditis elegans and Homo sapiens, and found that about 2-10% of proteins in the genomes are potential MPs.
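The general workflow of filling in missing features before classification can be illustrated with a small sketch. The abstract does not describe the authors' imputation technique or name their classifiers, so the code below substitutes two simple stand-ins: column-mean imputation and a nearest-centroid classifier. The function names, the three toy feature columns and all values are hypothetical and serve only as an illustration.

```python
from statistics import mean

def impute_column_means(rows):
    """Replace None entries with the mean of the observed values in that column.

    Stand-in for the paper's imputation technique, which the abstract
    does not describe.
    """
    n_cols = len(rows[0])
    col_means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        col_means.append(mean(observed) if observed else 0.0)
    return [[col_means[j] if r[j] is None else r[j] for j in range(n_cols)]
            for r in rows]

def centroid(rows):
    """Per-column mean of a list of complete feature vectors."""
    return [mean(col) for col in zip(*rows)]

def predict(x, centroids):
    """Return the label whose centroid is nearest to x (squared Euclidean distance)."""
    def dist2(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    return min(centroids, key=lambda label: dist2(x, centroids[label]))

# Hypothetical toy data: each row is one protein, each column one
# association-derived feature; None marks a missing feature.
features = [
    [0.9, 0.8, None],
    [0.8, None, 0.7],
    [0.1, 0.2, 0.1],
    [None, 0.1, 0.2],
]
labels = ["MP", "MP", "non-MP", "non-MP"]

# Fill the gaps, then build one centroid per class from the completed rows.
filled = impute_column_means(features)
centroids = {
    lab: centroid([r for r, l in zip(filled, labels) if l == lab])
    for lab in set(labels)
}

print(predict([0.85, 0.75, 0.6], centroids))
```

Mean imputation is used here only because it is the simplest option; it ignores correlations between features, which a method built for sparse proteomic data would likely need to exploit.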

AVAILABILITY AND IMPLEMENTATION:

Code available at http://kiharalab.org/MPprediction

CONTACT:

dkihara@purdue.edu

SUPPLEMENTARY INFORMATION:

Supplementary data are available at Bioinformatics online.

PMID: 27153604
PMCID: PMC4965633
DOI: 10.1093/bioinformatics/btw166
[Indexed for MEDLINE]
Free PMC Article
