Int J Mol Sci. 2016 May 18;17(5). pii: E757. doi: 10.3390/ijms17050757.

RVMAB: Using the Relevance Vector Machine Model Combined with Average Blocks to Predict the Interactions of Proteins from Protein Sequences.

Author information

1. School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 21116, China. ajy@cumt.edu.cn.
2. School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 21116, China. zhuhongyou@cumt.edu.cn.
3. School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 21116, China. mengfr@cumt.edu.cn.
4. School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 21116, China. ajysjm@163.com.
5. School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 21116, China. yinwang@cumt.edu.cn.

Abstract

Protein-protein interactions (PPIs) play essential roles in most cellular processes. Knowledge of PPIs is becoming increasingly important, which has prompted the development of technologies capable of discovering PPIs on a large scale. Although many high-throughput biological technologies have been proposed to detect PPIs, they have unavoidable shortcomings, including cost, time intensity, and inherently high false positive and false negative rates. For these reasons, in silico methods are attracting much attention due to their good performance in predicting PPIs. In this paper, we propose a novel computational method, RVM-AB, that combines the Relevance Vector Machine (RVM) model and Average Blocks (AB) to predict PPIs from protein sequences. The main improvements come from representing protein sequences using the AB feature representation on a Position Specific Scoring Matrix (PSSM), reducing the influence of noise using Principal Component Analysis (PCA), and using an RVM-based classifier. We performed five-fold cross-validation experiments on yeast and Helicobacter pylori datasets and achieved very high accuracies of 92.98% and 95.58%, respectively, which are significantly better than those of previous works. In addition, we obtained good prediction accuracies of 88.31%, 89.46%, 91.08%, 91.55%, and 94.81% on five other independent datasets (C. elegans, M. musculus, H. sapiens, H. pylori, and E. coli) for cross-species prediction. To further evaluate the proposed method, we compared it with the state-of-the-art support vector machine (SVM) classifier on the yeast dataset. The experimental results demonstrate that our RVM-AB method clearly outperforms the SVM-based method. These promising results show the efficiency and simplicity of the proposed method, which can serve as an automatic decision support tool. To facilitate future proteomics research, we developed a freely available web server, RVMAB-PPI, written in Hypertext Preprocessor (PHP), for predicting PPIs. The web server, including source code and datasets, is available at http://219.219.62.123:8888/ppi_ab/.
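
The abstract does not spell out how the Average Blocks representation is computed, so the following is only a minimal sketch of the general idea: a PSSM with one row per residue and one column per amino-acid type is cut into contiguous blocks along the sequence, and each block is replaced by its column-wise mean, giving a fixed-length vector regardless of protein length. The function name `average_blocks`, the default of 20 blocks, and the way block boundaries are rounded are assumptions made for illustration, not details taken from the paper. The PCA and RVM stages are not shown.

```python
from statistics import fmean
from typing import List, Sequence


def average_blocks(pssm: Sequence[Sequence[float]], n_blocks: int = 20) -> List[float]:
    """Summarize a PSSM (rows = residues) as concatenated per-block column means.

    Hypothetical helper: n_blocks=20 and the boundary rounding are assumptions.
    """
    n_rows = len(pssm)
    if n_rows < n_blocks:
        raise ValueError("PSSM needs at least n_blocks rows")
    n_cols = len(pssm[0])
    features: List[float] = []
    for b in range(n_blocks):
        # Integer bounds give contiguous blocks whose sizes differ by at most one.
        start = b * n_rows // n_blocks
        stop = (b + 1) * n_rows // n_blocks
        block = pssm[start:stop]
        # Column-wise mean over the residues that fall in this block.
        features.extend(fmean(row[c] for row in block) for c in range(n_cols))
    return features


# Toy 40-residue "PSSM" with 20 columns; real scores would come from a profile search.
toy_pssm = [[float(r * 20 + c) for c in range(20)] for r in range(40)]
vector = average_blocks(toy_pssm)
print(len(vector))  # 20 blocks x 20 columns
```

Because every protein maps to a vector of the same length, sequences of different lengths can be fed to the same downstream classifier.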

KEYWORDS:

PSSM; average blocks; protein sequence; relevance vector machine

PMID: 27213337
PMCID: PMC4881578
DOI: 10.3390/ijms17050757
[Indexed for MEDLINE]
Free PMC Article
