J Comput Aided Mol Des. 2016 Mar;30(3):209-17. doi: 10.1007/s10822-015-9893-9. Epub 2015 Dec 31.

Autocorrelation descriptor improvements for QSAR: 2DA_Sign and 3DA_Sign.

Author information

1. Departments of Chemistry, Pharmacology, and Biomedical Informatics, Center for Structural Biology, Institute for Chemical Biology, Vanderbilt University, 7330 Stevenson Center, Station B 351822, Nashville, TN, 37235, USA.
2. Institute of Biochemistry, Leipzig University, Brüderstraße 34, 04103, Leipzig, Germany.
3. Departments of Chemistry, Pharmacology, and Biomedical Informatics, Center for Structural Biology, Institute for Chemical Biology, Vanderbilt University, 7330 Stevenson Center, Station B 351822, Nashville, TN, 37235, USA. jens@meilerlab.org.

Abstract

Quantitative structure-activity relationship (QSAR) modeling is a branch of computer-aided drug discovery that relates chemical structures to biological activity. Two well-established and related QSAR descriptors are two- and three-dimensional autocorrelation (2DA and 3DA). These descriptors encode the relative position of atoms or atom properties by calculating the separation between atom pairs in terms of number of bonds (2DA) or Euclidean distance (3DA). The sums of all values computed for a given small molecule are collected in a histogram. Atom properties can be added with a coefficient that is the product of atom properties for each pair. This procedure can lead to information loss when signed atom properties, such as partial charge, are considered. For example, the product of two positive charges is indistinguishable from the product of two equivalent negative charges. In this paper, we present variations of 2DA and 3DA, called 2DA_Sign and 3DA_Sign, that avoid information loss by splitting unique sign pairs into individual histograms. We evaluate these variations with models trained on nine datasets spanning a range of drug target classes. Both 2DA_Sign and 3DA_Sign significantly increase model performance across all datasets when compared with traditional 2DA and 3DA. Lastly, we find that limiting 3DA_Sign to maximum atom pair distances of 6 Å instead of 12 Å further increases model performance, suggesting that conformational flexibility may hinder performance with longer 3DA descriptors. Consistent with this finding, limiting the number of bonds in 2DA_Sign from 11 to 5, a measure that does not depend on conformation, fails to improve performance.
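
The sign-collision problem described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the adjacency-list molecule representation, the restriction to distinct atom pairs, and the choice of exactly three sign classes ("++", "--", "+-", with zero treated as positive) are all assumptions made for illustration.

```python
from collections import deque


def bond_distances(adjacency):
    """All-pairs shortest path lengths in bonds, via BFS from each atom."""
    n = len(adjacency)
    dist = [[None] * n for _ in range(n)]
    for start in range(n):
        dist[start][start] = 0
        queue = deque([start])
        while queue:
            a = queue.popleft()
            for b in adjacency[a]:
                if dist[start][b] is None:
                    dist[start][b] = dist[start][a] + 1
                    queue.append(b)
    return dist


def autocorrelation_2d(adjacency, prop, max_bonds=11):
    """Plain 2DA sketch: one histogram of property products per bond distance."""
    dist = bond_distances(adjacency)
    hist = [0.0] * (max_bonds + 1)
    for i in range(len(prop)):
        for j in range(i + 1, len(prop)):  # assumption: distinct pairs only
            d = dist[i][j]
            if d is not None and d <= max_bonds:
                hist[d] += prop[i] * prop[j]
    return hist


def autocorrelation_2d_sign(adjacency, prop, max_bonds=11):
    """Sign-split sketch: one histogram per sign class of the atom pair."""
    dist = bond_distances(adjacency)
    hists = {key: [0.0] * (max_bonds + 1) for key in ("++", "--", "+-")}
    for i in range(len(prop)):
        for j in range(i + 1, len(prop)):
            d = dist[i][j]
            if d is None or d > max_bonds:
                continue
            # Assumption: zero counts as positive when classifying a pair.
            if prop[i] >= 0 and prop[j] >= 0:
                key = "++"
            elif prop[i] < 0 and prop[j] < 0:
                key = "--"
            else:
                key = "+-"
            hists[key][d] += prop[i] * prop[j]
    return hists


# Hypothetical two-atom "molecules" with equal and opposite charge patterns.
bonded_pair = [[1], [0]]
both_positive = [0.5, 0.5]
both_negative = [-0.5, -0.5]

plain_pos = autocorrelation_2d(bonded_pair, both_positive, max_bonds=2)
plain_neg = autocorrelation_2d(bonded_pair, both_negative, max_bonds=2)
sign_pos = autocorrelation_2d_sign(bonded_pair, both_positive, max_bonds=2)
sign_neg = autocorrelation_2d_sign(bonded_pair, both_negative, max_bonds=2)
```

In this toy case `plain_pos` and `plain_neg` are identical, whereas `sign_pos` and `sign_neg` place the same product in different histograms. A 3DA analogue would presumably replace the bond count with a binned Euclidean distance between atom coordinates; the binning scheme is not specified in the abstract.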

KEYWORDS:

2D autocorrelation; 3D autocorrelation; Artificial neural network; Descriptor; Quantitative structure activity relationship; Virtual high-throughput screening

PMID: 26721261
PMCID: PMC4803518 [Available on 2017-03-01]
DOI: 10.1007/s10822-015-9893-9
[Indexed for MEDLINE]
Free PMC Article
