J Cheminform. 2018 Apr 3;10(1):17. doi: 10.1186/s13321-018-0271-1.

A confidence predictor for logD using conformal regression and a support-vector machine.

Author information

1 Department of Pharmaceutical Biosciences, Uppsala University, Box 591, 751 24, Uppsala, Sweden.
2 Department of Pharmaceutical Biosciences, Uppsala University, Box 591, 751 24, Uppsala, Sweden. ola.spjuth@farmbio.uu.se.

Abstract

Lipophilicity is a major determinant of ADMET properties and of the overall suitability of drug candidates. We have developed large-scale models to predict the water-octanol distribution coefficient (logD) of chemical compounds, to aid drug discovery projects. Using ACD/logD data for 1.6 million compounds from the ChEMBL database, we build and evaluate models with a support-vector machine with a linear kernel, applying conformal prediction methodology to output prediction intervals at a specified confidence level. The resulting model shows a predictive ability of [Formula: see text], and the best-performing nonconformity measure gives a median prediction interval of [Formula: see text] log units at 80% confidence and [Formula: see text] log units at 90% confidence. The model is available as an online service via an OpenAPI interface and a web page with a molecular editor. We also publish predictive values at 90% confidence level for 91 M PubChem structures in RDF format, both for download and through a URI resolver service.
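To illustrate how conformal regression turns point predictions into prediction intervals at a chosen confidence level, the following is a minimal sketch of inductive (split) conformal regression. It is not the authors' implementation: an ordinary least-squares line stands in for the linear-kernel support-vector machine, the nonconformity measure is assumed to be the absolute residual, and the data are synthetic. All function names (fit_line, conformal_half_width, predict_interval) are hypothetical.

```python
import math
import random


def fit_line(xs, ys):
    """Least-squares fit y = a*x + b (assumed stand-in for the linear SVM)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx


def conformal_half_width(scores, confidence):
    """Interval half-width from calibration nonconformity scores.

    Takes the k-th smallest score with k = ceil((n + 1) * confidence);
    if the calibration set is too small, the interval is unbounded.
    """
    n = len(scores)
    k = math.ceil((n + 1) * confidence)
    if k > n:
        return math.inf
    return sorted(scores)[k - 1]


def predict_interval(model, half_width, x):
    """Symmetric prediction interval around the point prediction."""
    a, b = model
    y_hat = a * x + b
    return y_hat - half_width, y_hat + half_width


# Synthetic data (an assumption for illustration, not logD values).
rng = random.Random(0)
xs = [rng.uniform(-3, 3) for _ in range(600)]
ys = [1.5 * x + 0.5 + rng.gauss(0, 0.3) for x in xs]

# Proper training set, calibration set, and held-out test set.
model = fit_line(xs[:300], ys[:300])
a, b = model
scores = [abs(y - (a * x + b)) for x, y in zip(xs[300:500], ys[300:500])]

hw80 = conformal_half_width(scores, 0.80)
hw90 = conformal_half_width(scores, 0.90)

# Empirical coverage on the test set at 90% confidence.
hits = 0
for x, y in zip(xs[500:], ys[500:]):
    lo, hi = predict_interval(model, hw90, x)
    hits += lo <= y <= hi
coverage90 = hits / 100
print(f"half-width 80%: {hw80:.3f}, 90%: {hw90:.3f}, coverage: {coverage90:.2f}")
```

In this sketch every compound receives the same interval width. Normalized nonconformity measures scale the residual by a per-example difficulty estimate, so that interval widths vary between examples; this sketch does not attempt that.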

KEYWORDS:

Conformal prediction; LogD; Machine learning; QSAR; RDF; Support-vector machine
