J Chem Inf Model. 2019 Feb 25;59(2):713-730. doi: 10.1021/acs.jcim.8b00617. Epub 2019 Feb 12.

Rational Use of Heterogeneous Data in Quantitative Structure-Activity Relationship (QSAR) Modeling of Cyclooxygenase/Lipoxygenase Inhibitors.

Author information

Pirogov Russian National Research Medical University, Ostrovitianov Str. 1, Moscow, 117997, Russia.
Institute of Biomedical Chemistry, Pogodinskaya Str. 10/8, Moscow, 119121, Russia.
School of Pharmacy, Aristotle University, Thessaloniki, 54124, Greece.
School of Health and Medical Care, Alexander Technological Educational Institute of Thessaloniki, Thessaloniki, 57400, Greece.
National Center for Advancing Translational Sciences (NCATS), National Institutes of Health, Rockville, Maryland 20850, United States.


Numerous studies published in recent years report acceptable quantitative structure-activity relationship (QSAR) models built on heterogeneous data. In many cases, the training sets for QSAR modeling were constructed from compounds tested in different biological assays, which contradicts the view that QSAR modeling should be based on data measured by a single protocol. We attempted to develop approaches that help determine how heterogeneous data should be used to create QSAR models from different sets of compounds tested by different experimental methods for the same target and the same endpoint. To this end, more than 100 QSAR models for the IC50 values of ligands interacting with cyclooxygenase-1 and -2 (COX-1, COX-2) and seed lipoxygenase (LOX), obtained from the ChEMBL database, were created using the GUSAR software. The QSAR models were tested on an external set of 26 new thiazolidinone derivatives, which were experimentally tested for COX-1, COX-2, and LOX inhibition. The IC50 values of the derivatives varied from 89 μM to 26 μM for LOX, from 200 μM to 0.018 μM for COX-1, and from 210 μM to 1 μM for COX-2. This study showed that the accuracy of the models depends on the distribution of IC50 values of low-activity compounds in the training sets. In most cases, QSAR models built on combined training sets had advantages over QSAR models based on a single publication. We introduced a new method, called "scaling", for combining quantitative data from different experimental studies on the basis of data for reference compounds.
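The IC50 ranges above are easier to compare on a logarithmic scale. The following is a minimal sketch, not taken from the paper, that converts the reported micromolar values to pIC50 using the standard definition pIC50 = -log10(IC50 in mol/L). The function and variable names are illustrative assumptions, and the abstract does not state which scale the authors' models used.

```python
import math


def pic50_from_um(ic50_um: float) -> float:
    """Convert an IC50 given in micromolar to pIC50 = -log10(IC50 in mol/L)."""
    if ic50_um <= 0:
        raise ValueError("IC50 must be positive")
    # 1 uM = 1e-6 mol/L, so -log10(x * 1e-6) = 6 - log10(x)
    return 6.0 - math.log10(ic50_um)


def pic50_span(weakest_um: float, strongest_um: float) -> float:
    """Width of an activity range in log units (pIC50 of strongest minus weakest)."""
    return pic50_from_um(strongest_um) - pic50_from_um(weakest_um)


# IC50 ranges in uM as reported in the abstract: (weakest, strongest).
# The dictionary name and layout are hypothetical, chosen for this example.
REPORTED_RANGES_UM = {
    "LOX": (89.0, 26.0),
    "COX-1": (200.0, 0.018),
    "COX-2": (210.0, 1.0),
}

for target, (weak, strong) in REPORTED_RANGES_UM.items():
    print(
        f"{target}: pIC50 {pic50_from_um(weak):.2f} to {pic50_from_um(strong):.2f} "
        f"(span {pic50_span(weak, strong):.2f} log units)"
    )
```

On this scale the external set spans roughly 0.5 log units for LOX, about 4 for COX-1, and about 2.3 for COX-2, so the LOX values cover a much narrower activity range than the COX values.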

