ChemMedChem. 2018 Mar 20;13(6):540-554. doi: 10.1002/cmdc.201700561. Epub 2018 Jan 29.

Mapping of the Available Chemical Space versus the Chemical Universe of Lead-Like Compounds.

Author information

Laboratory of Chemoinformatics, Faculty of Chemistry, University of Strasbourg, 4 Blaise Pascal str., 67081, Strasbourg, France.
Laboratory of Chemoinformatics and Molecular Modeling, Department of Organic Chemistry, A.M. Butlerov Institute of Chemistry, Kazan Federal University, 18 Kremlyovskaya str., 420008, Kazan, Russia.
Department of Chemistry and Biochemistry, University of Berne, 3 Freiestrasse, 3012, Berne, Switzerland.


This is, to our knowledge, the most comprehensive analysis to date of fragment-like chemical space based on generative topographic mapping (GTM): 40 million molecules with no more than 17 heavy atoms, drawn both from the theoretically enumerated GDB-17 and from the real-world PubChem and ChEMBL databases. The challenge was to prove that a robust map of fragment-like chemical space can actually be built, in spite of the limited maximal number (≪10^5) of compounds (the "frame set") usable for fitting the GTM manifold. An evolutionary map-building strategy was updated with a "coverage check" step, which discards manifolds that fail to accommodate compounds outside the frame set. The evolved map separates actives from inactives well for more than 20 external structure-activity sets, and it was shown to properly accommodate the entire collection of 40 million compounds. Next, it served as a library comparison tool to highlight biases of real-world molecules (PubChem and ChEMBL) versus the universe of all possible species represented by FDB-17, a fragment-like subset of GDB-17 containing 10 million molecules. Specific patterns that are present in some libraries and absent from others (diversity holes) were highlighted.
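
To make the library comparison idea concrete, here is a minimal sketch and not the authors' implementation. It relies on the fact that GTM projects each molecule as a probability distribution ("responsibilities") over the map nodes, so summing responsibilities over a library gives a per-node density. The function names, the toy four-node map, and the `occupied` and `empty` thresholds are all assumptions chosen for illustration.

```python
from typing import List, Sequence


def normalize(cumulated: Sequence[float]) -> List[float]:
    """Scale a per-node cumulated responsibility vector so it sums to 1."""
    total = sum(cumulated)
    if total <= 0:
        raise ValueError("library places no density on the map")
    return [value / total for value in cumulated]


def diversity_holes(
    lib_a: Sequence[float],
    lib_b: Sequence[float],
    occupied: float = 0.05,  # assumed threshold: node counts as populated
    empty: float = 0.005,    # assumed threshold: node counts as (nearly) empty
) -> List[int]:
    """Return indices of nodes populated by lib_a but nearly empty in lib_b."""
    if len(lib_a) != len(lib_b):
        raise ValueError("both libraries must be projected on the same map")
    dens_a = normalize(lib_a)
    dens_b = normalize(lib_b)
    return [
        node
        for node, (a, b) in enumerate(zip(dens_a, dens_b))
        if a >= occupied and b <= empty
    ]


# Toy example on a hypothetical 4-node map (values are made up).
enumerated = [250.0, 250.0, 250.0, 250.0]  # evenly spread enumerated library
real_world = [600.0, 398.0, 2.0, 0.0]      # real-world library, biased to nodes 0 and 1
holes = diversity_holes(enumerated, real_world)
print(holes)
```

In this toy case the nodes reported are those that the enumerated library fills but the real-world library barely touches. Normalizing first keeps libraries of very different sizes comparable on the same map.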


Keywords: computer chemistry; generative topographic mapping; library comparison; molecular diversity; structure analysis

