Mol Divers. 2010 Nov;14(4):667-71. doi: 10.1007/s11030-009-9205-1. Epub 2009 Nov 12.

Prediction of subcellular location of mycobacterial protein using feature selection techniques.

Author information

Key Laboratory for NeuroInformation of Ministry of Education, School of Life Science and Technology, University of Electronic Science and Technology of China, 610054, Chengdu, China.


Mycobacterium tuberculosis is the primary pathogen causing tuberculosis, one of the most prevalent infectious diseases. The subcellular location of mycobacterial proteins can provide essential clues for protein function research and drug discovery. It is therefore highly desirable to develop a computational method for fast and reliable prediction of the subcellular location of mycobacterial proteins. In this study, we developed a support vector machine (SVM) based method for this task. A total of 444 non-redundant mycobacterial proteins were used to train and test the proposed model using jackknife cross-validation. With the traditional pseudo amino acid composition (PseAAC) as input parameters, an overall accuracy of 83.3% was achieved. Moreover, a feature selection technique was developed to find an optimal number of PseAAC features, which improved the overall accuracy from 83.3% to 87.2%. In addition, the reduced amino acids in the N-terminal and non-N-terminal regions of the proteins were combined in the models to further improve the prediction success rate. As a result, a maximum overall accuracy of 91.2% was achieved, with an average accuracy of 79.7%. The proposed model provides useful information for further experimental research. The prediction model can be accessed free of charge at
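The evaluation protocol described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that "average accuracy" means the mean of the per-location accuracies, which would explain why it can sit well below the overall accuracy when some locations have few proteins. It uses plain amino acid composition instead of the full PseAAC, and a 1-nearest-neighbour rule as a stand-in for the SVM. All function names are hypothetical.

```python
from collections import Counter, defaultdict

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(seq):
    """Return the 20 amino acid frequencies of a protein sequence.

    Simplification: the paper uses PseAAC, which adds sequence-order
    terms on top of this plain composition.
    """
    counts = Counter(ch for ch in seq.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    return [counts[a] / total if total else 0.0 for a in AMINO_ACIDS]


def nearest_neighbour(train_x, train_y, x):
    """Stand-in classifier (NOT the paper's SVM): 1-nearest neighbour."""
    dists = [sum((a - b) ** 2 for a, b in zip(row, x)) for row in train_x]
    return train_y[dists.index(min(dists))]


def jackknife_predictions(features, labels, train_and_predict):
    """Jackknife (leave-one-out): hold out each protein once,
    train on all the others, and predict the held-out one."""
    predictions = []
    for i in range(len(features)):
        train_x = features[:i] + features[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        predictions.append(train_and_predict(train_x, train_y, features[i]))
    return predictions


def overall_and_average_accuracy(labels, predictions):
    """Overall accuracy = correct / total.
    Average accuracy (assumed definition) = mean of per-class accuracies."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for y, p in zip(labels, predictions):
        total[y] += 1
        correct[y] += int(y == p)
    overall = sum(correct.values()) / len(labels)
    average = sum(correct[c] / total[c] for c in total) / len(total)
    return overall, average
```

With imbalanced classes the two measures diverge: if three of four proteins belong to one location and every protein is assigned to that location, the overall accuracy is 0.75 but the average accuracy under the assumed definition is only 0.5.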

[Indexed for MEDLINE]
