Predicting membrane protein types by fusing composite protein sequence features into pseudo amino acid composition

J Theor Biol. 2011 Feb 21;271(1):10-7. doi: 10.1016/j.jtbi.2010.11.017. Epub 2010 Nov 24.

Abstract

Membrane proteins are vital type of proteins that serve as channels, receptors, and energy transducers in a cell. Prediction of membrane protein types is an important research area in bioinformatics. Knowledge of membrane protein types provides some valuable information for predicting novel example of the membrane protein types. However, classification of membrane protein types can be both time consuming and susceptible to errors due to the inherent similarity of membrane protein types. In this paper, neural networks based membrane protein type prediction system is proposed. Composite protein sequence representation (CPSR) is used to extract the features of a protein sequence, which includes seven feature sets; amino acid composition, sequence length, 2 gram exchange group frequency, hydrophobic group, electronic group, sum of hydrophobicity, and R-group. Principal component analysis is then employed to reduce the dimensionality of the feature vector. The probabilistic neural network (PNN), generalized regression neural network, and support vector machine (SVM) are used as classifiers. A high success rate of 86.01% is obtained using SVM for the jackknife test. In case of independent dataset test, PNN yields the highest accuracy of 95.73%. These classifiers exhibit improved performance using other performance measures such as sensitivity, specificity, Mathew's correlation coefficient, and F-measure. The experimental results show that the prediction performance of the proposed scheme for classifying membrane protein types is the best reported, so far. This performance improvement may largely be credited to the learning capabilities of neural networks and the composite feature extraction strategy, which exploits seven different properties of protein sequences. The proposed Mem-Predictor can be accessed at http://111.68.99.218/Mem-Predictor.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Amino Acids / chemistry*
  • Animals
  • Computational Biology / methods
  • Databases, Protein
  • Hydrophobic and Hydrophilic Interactions
  • Membrane Proteins / chemistry*
  • Neural Networks, Computer
  • Principal Component Analysis
  • Sequence Analysis, Protein / methods*
  • Support Vector Machine

Substances

  • Amino Acids
  • Membrane Proteins