BMC Bioinformatics. 2016 Aug 17;17(1):307. doi: 10.1186/s12859-016-1165-8.

A machine learning strategy for predicting localization of post-translational modification sites in protein-protein interacting regions.

Author information

1. Systems Biology Center, Research Affairs, Faculty of Medicine, Chulalongkorn University, 1873 Rama 4 Road, Pathumwan, Bangkok, 10330, Thailand.
2. Department of Medicine, Division of Nephrology, Faculty of Medicine, Chulalongkorn University, 1873 Rama 4 Road, Pathumwan, Bangkok, 10330, Thailand. fmedyah@md.chula.ac.th.
3. Systems Biology Center, Research Affairs, Faculty of Medicine, Chulalongkorn University, 1873 Rama 4 Road, Pathumwan, Bangkok, 10330, Thailand. pisitkut@nhlbi.nih.gov.
4. Epithelial Systems Biology Laboratory, NHLBI, National Institutes of Health, Bethesda, MD, 20892-1603, USA. pisitkut@nhlbi.nih.gov.

Abstract

BACKGROUND:

One very important functional domain of proteins is the protein-protein interacting region (PPIR), which forms the binding interface between interacting polypeptide chains. Post-translational modifications (PTMs) that occur in the PPIR can either interfere with or facilitate the interaction between proteins. The ability to predict whether sites of protein modifications are inside or outside of PPIRs would be useful in further elucidating the regulatory mechanisms by which modifications of specific proteins regulate their cellular functions.

RESULTS:

Using two comprehensive databases, PDB for protein-protein interaction data and PhosphoSitePlus for protein modification site data, we created new databases that map PTMs to their locations inside or outside of PPIRs. The mapped PTMs represented only 5% of all known PTMs. Thus, to predict localization inside or outside of PPIRs for the vast majority of PTMs, a machine learning strategy was used to generate predictive models from these mapped databases. For the three mapped PTM databases that had sufficient numbers of modification sites for generating models (acetylation, phosphorylation, and ubiquitylation), the resulting models yielded high overall predictive performance as judged by a combined performance score (CPS). Among the multiple amino acid properties used in the classification tasks, hydrophobicity was found to contribute substantially to the performance of the final predictive models. Of the classifiers we evaluated, the SVM provided the best overall performance.
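As an illustration of the kind of input such a classifier could use, the following minimal sketch labels a modification site as inside or outside an interface and encodes the residues around it by hydrophobicity. It is not the authors' implementation: the Kyte-Doolittle scale stands in for whichever AAindex properties were actually used, and the window size, zero padding, function names, toy sequence, and interface positions are all assumptions made for the example.

```python
# Kyte-Doolittle hydropathy scale, used here as one example of a
# hydrophobicity index (an assumption; the paper draws on AAindex).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_features(sequence, site, flank=3, pad=0.0):
    """Hydropathy values for the residues from site-flank to site+flank.

    `site` is a 1-based position. Positions beyond the sequence ends and
    unknown residue letters are filled with `pad` (hypothetical choice).
    """
    center = site - 1  # convert to a 0-based index
    features = []
    for i in range(center - flank, center + flank + 1):
        if 0 <= i < len(sequence):
            features.append(KD.get(sequence[i], pad))
        else:
            features.append(pad)
    return features


def label_site(site, interface_positions):
    """Return 1 if the site lies inside the interface set, else 0."""
    return 1 if site in interface_positions else 0


# Toy data, invented purely for illustration.
sequence = "MKSAYLDE"
interface = {4, 5, 6}   # hypothetical 1-based interface residue positions
sites = [3, 5]          # hypothetical modification sites

dataset = [
    (window_features(sequence, s, flank=2), label_site(s, interface))
    for s in sites
]
```

Each entry of `dataset` pairs a fixed-length feature vector with a binary label, which is the shape of input a classifier such as an SVM expects.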

CONCLUSIONS:

These models are the first to predict whether PTMs are located inside or outside of PPIRs, and they do so with high predictive performance. The models and data presented here should be useful in prioritizing both known and newly identified PTMs for further studies to determine the functional relationship between specific PTMs and protein-protein interactions. The implemented R package is available online (http://sysbio.chula.ac.th/PtmPPIR).

KEYWORDS:

AAindex; Machine learning; Post-translational modification; Protein-protein interacting region

PMID: 27534850
PMCID: PMC4989344
DOI: 10.1186/s12859-016-1165-8
[Indexed for MEDLINE]
Free PMC Article
