J Chem Inf Comput Sci. 2002 Nov-Dec;42(6):1407-14.

Performance of similarity measures in 2D fragment-based similarity searching: comparison of structural descriptors and similarity coefficients.

Author information

Computer Assisted Drug Discovery, Johnson & Johnson Pharmaceutical Research and Development, L.L.C., Raritan, New Jersey 08869, and Spring House, Pennsylvania 19477, USA.


2D fragment-based similarity searching is one of the most popular techniques for searching a large database of chemical structures and has been widely applied in drug discovery. However, its performance, especially its effectiveness in retrieving active structural analogues, has not been adequately studied. We report a series of computational experiments in which we systematically studied the influence of structural descriptors and similarity coefficients on the effectiveness of similarity searching. The study was conducted using two large public data sets, NCI anti-AIDS and MDDR. Four sets of 2D linear fragment descriptors, based on the original definitions of atom pairs and atom sequences, were compared. The effect of using the Tanimoto coefficient and the Euclidean distance was studied as a function of descriptor set. The results clearly indicate that the Tanimoto coefficient is superior to the Euclidean distance in 2D fragment-based similarity searching, in terms of hit rate, while atom sequences demonstrate the best overall performance among the structural descriptors we studied.
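To make the two measures compared above concrete, here is a minimal sketch that computes both for fragment sets. It assumes binary (present/absent) fragment fingerprints, which the abstract does not specify. The fragment labels and the function names `tanimoto` and `euclidean` are hypothetical illustrations, not the descriptors or code used in the study.

```python
import math


def tanimoto(a: set, b: set) -> float:
    """Tanimoto coefficient: shared fragments / fragments in either set."""
    if not a and not b:
        return 1.0  # convention chosen for this sketch: two empty sets match
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def euclidean(a: set, b: set) -> float:
    """Euclidean distance between the binary vectors implied by two sets.

    For 0/1 vectors, the squared distance equals the number of fragments
    present in exactly one of the two sets (the symmetric difference).
    """
    return math.sqrt(len(a ^ b))


# Hypothetical fragment labels, used only for illustration.
query = {"C-C", "C-N", "C-O", "N-C-O"}
small = {"C-C", "C-N"}                                 # subset of the query
large = {"C-C", "C-N", "C-O", "N-C-O", "C-S", "S-O"}   # superset of the query

for name, candidate in (("small", small), ("large", large)):
    print(name, tanimoto(query, candidate), euclidean(query, candidate))
```

In this toy case both candidates differ from the query by two fragments, so they lie at the same Euclidean distance, while the Tanimoto coefficient ranks the candidate containing all of the query's fragments higher. This only illustrates how the two measures can rank structures differently; it is not a reproduction of the reported hit-rate results.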

[Indexed for MEDLINE]
