Mol Inform. 2012 Jan;31(1):53-62. doi: 10.1002/minf.201100052. Epub 2012 Jan 2.

Classification Models for Predicting Cytochrome P450 Enzyme-Substrate Selectivity.

Author information

1. State Key Laboratory of Microbial Metabolism and College of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai, P. R. China, 200240; phone/fax: (021)-34204573.
2. School of Biomedical Engineering, Tianjin Medical University, Tianjin, P. R. China, 300070.
3. Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA.
4. Faculty of Health and Medical Sciences, University of Surrey, Guildford, Surrey, GU2 7XH, UK.
5. State Key Laboratory of Microbial Metabolism and College of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai, P. R. China, 200240; phone/fax: (021)-34204573. Email: dqwei@sjtu.edu.cn.

Abstract

The cytochrome P450 (CYP) enzymes form an important drug-metabolizing enzyme family. Different CYPs often have different substrate preferences, and a single drug molecule may be preferentially metabolized by one or more CYP enzymes. Classifying and predicting the substrate specificity of CYP enzymes is therefore important for understanding drug metabolism and may help guide the development of new drugs. In this study, we used three different machine learning methods to classify CYP substrates and predict CYP-substrate specificity based solely on the structural and physicochemical properties of the substrates. We first built a simple decision tree model that classifies substrates of four CYP enzymes (1A2, 2C9, 2D6 and 3A4) with more than 78% classification accuracy. We then built a single-label eight-class model to classify substrates of eight CYP enzymes, and a multilabel five-class model to classify substrates that can be metabolized by more than one CYP enzyme. Prediction accuracy was above 90% for the single-label model and above 80% for the multilabel model. The main improvement of our models over existing ones is the automated and unbiased selection of descriptors by genetic algorithms, which makes our methods applicable to larger data sets and to a larger number of CYP enzymes.

KEYWORDS:

Bioinformatics; Decision tree; Enzymes; Genetic algorithm; Neural network

PMID: 27478177
DOI: 10.1002/minf.201100052
