Proteins. 2008 Apr;71(1):189-94.

Prediction of RNA binding sites in a protein using SVM and PSSM profile.

Author information

Bioinformatics Centre, Institute of Microbial Technology, Chandigarh-160036, India.

Abstract

RNA-binding proteins (RBPs) play key roles in post-transcriptional control of gene expression, which, along with transcriptional regulation, is a major way of regulating patterns of gene expression during development. Identifying and predicting RNA binding sites is therefore an important step toward a comprehensive understanding of how RBPs control organism development. Combining evolutionary information with a support vector machine (SVM), we have developed an improved method for predicting RNA binding sites, or RNA-interacting residues, in a protein sequence. The prediction models developed in this study were trained and tested on 86 RNA-binding protein chains and evaluated using fivefold cross-validation. First, an SVM model was developed that achieved a maximum Matthews correlation coefficient (MCC) of 0.31. When multiple sequence alignment information in the form of position-specific scoring matrix (PSSM) profiles was used as input to the SVM, the MCC improved from 0.31 to 0.45, which is higher than the maximum MCC achieved by previous methods (0.41) on the same dataset. In addition, SVM models were developed on an alternative dataset containing 107 RBP chains. With PSSM profiles as input to the SVM, training and testing on this alternative dataset achieved a maximum MCC of 0.32. In conclusion, the prediction performance of the SVM models developed in this study is better than that of existing methods on the same datasets. A web server, 'Pprint', was also developed for predicting RNA-binding residues in a protein sequence; it is freely available at http://www.imtech.res.in/raghava/pprint/.
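
As a rough illustration of two ideas the abstract relies on, the sketch below computes the Matthews correlation coefficient from confusion-matrix counts and builds a fixed-length feature vector for one residue from a window of PSSM rows. It is not the authors' implementation. The abstract does not state the window size, the PSSM scaling, or how chain termini were handled, so the function names, the `half_width` parameter, and the zero-padding at the termini are all assumptions made for illustration.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Common convention (an assumption here): report 0.0 when a marginal
    # total is zero and the coefficient is otherwise undefined.
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


def window_features(pssm, center, half_width):
    """Flatten the PSSM rows in a window centred on one residue.

    pssm is a list of per-residue score rows (typically 20 values each).
    Positions beyond either end of the chain are padded with zero rows;
    this padding scheme is a hypothetical choice, not taken from the paper.
    """
    row_length = len(pssm[0])
    features = []
    for i in range(center - half_width, center + half_width + 1):
        row = pssm[i] if 0 <= i < len(pssm) else [0.0] * row_length
        features.extend(row)
    return features


if __name__ == "__main__":
    # Toy 3-residue "PSSM" with 2 columns, only to show the shapes involved.
    toy_pssm = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    print(window_features(toy_pssm, center=0, half_width=1))
    print(mcc(tp=6, tn=3, fp=1, fn=2))
```

Each residue's feature vector would then be passed to an SVM classifier together with its binding or non-binding label. MCC is a natural summary here because it accounts for all four confusion-matrix cells, which matters when RNA-interacting residues are a minority of all residues.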

PMID: 17932917
DOI: 10.1002/prot.21677
[Indexed for MEDLINE]
