Structure. 2019 Sep 3;27(9):1469-1481.e3. doi: 10.1016/j.str.2019.06.001. Epub 2019 Jul 3.

Building a Hybrid Physical-Statistical Classifier for Predicting the Effect of Variants Related to Protein-Drug Interactions.

Author information

1 Department of Chemistry, Yale University, New Haven, CT 06520, USA.
2 Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520, USA.
3 Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA; Yale School of Medicine, Yale University, New Haven, CT 06520, USA.
4 Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA.
5 Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520, USA; Department of Computer Science, Yale University, New Haven, CT 06520, USA. Electronic address: mark@gersteinlab.org.

Abstract

A key issue in drug design is how population variation affects drug efficacy by altering binding affinity (BA) in different individuals, an essential consideration for government regulators. Ideally, we would like to evaluate the BA perturbations of millions of single-nucleotide variants (SNVs). However, only hundreds of protein-drug complexes with SNVs have experimentally characterized BAs, constituting too small a gold standard for straightforward statistical model training. Thus, we take a hybrid approach: using physically based calculations to bootstrap the parameterization of a full model. In particular, we perform 3D structure-based docking on ∼10,000 SNVs modifying known protein-drug complexes to construct a pseudo gold standard. We then use this augmented set of BAs to train a statistical model combining structure, ligand, and sequence features, and illustrate how it can be applied to millions of SNVs. Finally, we show that our model has good cross-validated performance (97% AUROC) and can also be validated by orthogonal ligand-binding data.
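For readers unfamiliar with the AUROC figure quoted above, the following is a minimal sketch of how the metric can be computed from held-out predictions, using the rank-sum (Mann-Whitney U) identity. It is an illustration only and not the authors' code: the function name `auroc`, the label convention (1 for a variant assumed to perturb binding, 0 otherwise), and the example scores are all assumptions introduced here.

```python
def auroc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity.

    labels: sequence of 0/1 values (1 = positive class; hypothetical
            convention here: the variant perturbs binding affinity)
    scores: sequence of classifier scores, higher = more likely positive
    """
    pairs = sorted(zip(scores, labels))
    n = len(pairs)

    # Assign 1-based ranks by ascending score, averaging over tied scores.
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1

    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one positive and one negative")

    # U statistic for the positive class, normalized by the number of
    # positive-negative pairs.
    rank_sum_pos = sum(r for r, (_, y) in zip(ranks, pairs) if y == 1)
    u = rank_sum_pos - n_pos * (n_pos + 1) / 2
    return u / (n_pos * n_neg)


# Made-up scores for four held-out variants (illustrative values only).
print(auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

An AUROC of 0.5 corresponds to random ranking and 1.0 to perfect separation of the two classes. In a cross-validated setting, the scores passed in would be the predictions made for each variant while it was held out of training.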

KEYWORDS: drug resistance; machine learning; nsSNV; protein-drug interactions

PMID: 31279629
DOI: 10.1016/j.str.2019.06.001
