Mutation probability of cytochrome P450 based on a genetic algorithm and support vector machine

Biotechnol J. 2011 Nov;6(11):1367-76. doi: 10.1002/biot.201000450. Epub 2011 Jul 1.

Abstract

The support vector machine (SVM), an effective statistical learning method, has been widely used in mutation prediction. Two factors, feature selection and parameter setting, strongly influence the efficiency and accuracy of SVM classification. In this study, following the principles of the genetic algorithm (GA) and the SVM, we developed a GA-SVM program and applied it to human cytochrome P450s (CYP450s), which are important monooxygenases in phase I drug metabolism. The program optimizes features and parameters simultaneously, so fewer features are used and the overall prediction accuracy improves. We focus on mutations arising from non-synonymous single nucleotide polymorphisms (nsSNPs), which alter protein sequences and appear to significantly influence drug metabolism. The final predictive model performs satisfactorily, with a prediction accuracy of 61% and a cross-validation accuracy of 73%. The results indicate that the GA-SVM program is a powerful tool for optimizing mutation predictive models of nsSNPs in human CYP450s.
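
The idea of optimizing features and SVM parameters simultaneously can be illustrated with a minimal Python sketch. It is not the authors' program: the abstract does not describe their encoding, operators, or kernel, so everything here is an assumption made for illustration. The sketch assumes a binary chromosome holding one bit per candidate feature plus two 8-bit fields decoded on a log scale into the RBF-kernel parameters C and gamma, tournament selection, one-point crossover, bit-flip mutation, and elitism. The names `decode`, `svm_cv_fitness`, and `run_ga` are hypothetical, and the parameter ranges are illustrative only. The fitness helper relies on scikit-learn (`SVC`, `cross_val_score`), which is imported only when that helper is called.

```python
import random

N_BITS = 8  # bits per encoded hyperparameter (an assumption of this sketch)


def decode(chrom, n_features):
    """Split a bit list into (feature mask, C, gamma)."""
    mask = chrom[:n_features]
    c_bits = chrom[n_features:n_features + N_BITS]
    g_bits = chrom[n_features + N_BITS:n_features + 2 * N_BITS]

    def to_unit(bits):
        # Map the bit field to a number in [0, 1].
        return int("".join(map(str, bits)), 2) / (2 ** N_BITS - 1)

    # Illustrative log-scale ranges: C in [2^-5, 2^15], gamma in [2^-15, 2^3].
    C = 2.0 ** (-5 + 20 * to_unit(c_bits))
    gamma = 2.0 ** (-15 + 18 * to_unit(g_bits))
    return mask, C, gamma


def svm_cv_fitness(X, y, folds=5):
    """Build a fitness function: mean cross-validated accuracy of an RBF SVM."""
    import numpy as np  # imported lazily so the GA itself needs only the stdlib
    from sklearn.model_selection import cross_val_score
    from sklearn.svm import SVC

    X = np.asarray(X)

    def fitness(mask, C, gamma):
        cols = [i for i, bit in enumerate(mask) if bit]
        if not cols:  # a chromosome that selects no features is useless
            return 0.0
        model = SVC(kernel="rbf", C=C, gamma=gamma)
        return cross_val_score(model, X[:, cols], y, cv=folds).mean()

    return fitness


def run_ga(fitness, n_features, pop_size=20, generations=30, p_mut=0.02, seed=0):
    """Search feature masks and (C, gamma) together; return (decoded best, score)."""
    rng = random.Random(seed)
    length = n_features + 2 * N_BITS
    pop = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]

    def evaluate(population):
        return [(fitness(*decode(ch, n_features)), ch) for ch in population]

    scored = evaluate(pop)

    def pick():
        # Binary tournament: the fitter of two random individuals wins.
        a, b = rng.sample(scored, 2)
        return a[1] if a[0] >= b[0] else b[1]

    for _ in range(generations):
        elite = max(scored, key=lambda t: t[0])[1]
        nxt = [elite[:]]  # elitism: carry the best chromosome over unchanged
        while len(nxt) < pop_size:
            p1, p2 = pick(), pick()
            cut = rng.randrange(1, length)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            child = [b ^ 1 if rng.random() < p_mut else b for b in child]
            nxt.append(child)
        scored = evaluate(nxt)

    best_score, best = max(scored, key=lambda t: t[0])
    return decode(best, n_features), best_score
```

With a feature matrix `X` and class labels `y` (not provided here), a call such as `run_ga(svm_cv_fitness(X, y), n_features=len(X[0]))` would return the selected feature mask, C, gamma, and the cross-validated accuracy of that combination. Because the mask and the kernel parameters share one chromosome, each fitness evaluation scores a feature subset together with the parameters tuned for it, which is the property the abstract attributes to the GA-SVM approach.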

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Chemical Phenomena
  • Cytochrome P-450 Enzyme System / genetics*
  • Cytochrome P-450 Enzyme System / metabolism
  • Humans
  • Logistic Models
  • Models, Theoretical
  • Mutation*
  • Polymorphism, Single Nucleotide
  • Probability
  • Reproducibility of Results
  • Sensitivity and Specificity
  • Support Vector Machine*

Substances

  • Cytochrome P-450 Enzyme System