J Mol Recognit. 2015 Jan;28(1):35-48. doi: 10.1002/jmr.2410.

Prediction of protein-protein interaction sites from weakly homologous template structures using meta-threading and machine learning.

Author information

1 Department of Biological Sciences, Louisiana State University, Baton Rouge, LA, 70803, USA.
2 Center for Computation & Technology, Louisiana State University, Baton Rouge, LA, 70803, USA.

Abstract

The identification of protein-protein interactions is vital for understanding protein function, for elucidating interaction mechanisms, and for practical applications in drug discovery. With the exponentially growing protein sequence data, fully automated computational methods that predict interactions between proteins are becoming essential components of system-level function inference. A thorough analysis of protein complex structures demonstrated that binding site locations as well as the interfacial geometry are highly conserved across evolutionarily related proteins. Because the conformational space of protein-protein interactions is highly covered by experimental structures, sensitive protein threading techniques can be used to identify suitable templates for the accurate prediction of interfacial residues. Toward this goal, we developed eFindSite(PPI), an algorithm that uses the three-dimensional structure of a target protein, evolutionarily remotely related templates, and machine learning techniques to predict binding residues. Using crystal structures, the average sensitivity (specificity) of eFindSite(PPI) in interfacial residue prediction is 0.46 (0.92). For weakly homologous protein models, these values decrease only slightly to 0.40-0.43 (0.91-0.92), demonstrating that eFindSite(PPI) not only performs well using experimental data but also tolerates structural imperfections in computer-generated structures. In addition, eFindSite(PPI) detects specific molecular interactions at the interface; for instance, it correctly predicts approximately one half of hydrogen bonds and aromatic interactions, as well as one third of salt bridges and hydrophobic contacts. Comparative benchmarks against several dimer datasets show that eFindSite(PPI) outperforms other methods for protein-binding residue prediction. It also features a carefully tuned confidence estimation system, which is particularly useful in large-scale applications using raw genomic data.
eFindSite(PPI) is freely available to the academic community at http://www.brylinski.org/efindsiteppi.
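
The sensitivity and specificity figures quoted in the abstract are per-residue classification metrics. As a minimal sketch of how such values are conventionally computed (this is not eFindSite(PPI) code; the function name `sensitivity_specificity` and the toy residue labels are assumptions made for illustration), each residue is labeled as interfacial or not, and predictions are compared against the reference labels:

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    predicted: Sequence[bool], actual: Sequence[bool]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for per-residue interface labels.

    Hypothetical helper for illustration; True marks an interfacial residue.
    """
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must cover the same residues")

    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)          # interfacial, predicted
    fn = sum(1 for p, a in pairs if not p and a)      # interfacial, missed
    tn = sum(1 for p, a in pairs if not p and not a)  # non-interfacial, rejected
    fp = sum(1 for p, a in pairs if p and not a)      # non-interfacial, predicted

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need both interfacial and non-interfacial residues")

    sensitivity = tp / (tp + fn)  # fraction of true interface residues found
    specificity = tn / (tn + fp)  # fraction of non-interface residues rejected
    return sensitivity, specificity


# Toy 10-residue protein (made-up labels): residues 0-3 form the interface.
actual = [True] * 4 + [False] * 6
predicted = [True, True, False, False, False, False, True, False, False, False]

sens, spec = sensitivity_specificity(predicted, actual)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

Because interfacial residues are typically a minority of a protein's residues, a predictor can reach high specificity while still missing many interface residues, which is why both numbers are reported together.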

KEYWORDS:

eThread; eFindSitePPI; interfacial site prediction; machine learning; meta-threading; protein models; protein-binding site prediction

PMID: 26268369
DOI: 10.1002/jmr.2410
[Indexed for MEDLINE]
