J Chem Inf Model. 2011 Nov 28;51(11):2843-51. doi: 10.1021/ci200282z. Epub 2011 Oct 18.

Power keys: a novel class of topological descriptors based on exhaustive subgraph enumeration and their application in substructure searching.

Author information

1. Johnson & Johnson Pharmaceutical Research & Development, L.L.C., Welsh & McKean Roads, Spring House, Pennsylvania 19477, USA. puliu45@gmail.com

Abstract

We present a novel class of topological molecular descriptors, which we call power keys. Power keys are computed by enumerating all possible linear, branch, and cyclic subgraphs up to a given size, encoding the connected atoms and bonds into two separate components, and recording the number of occurrences of each subgraph. We have applied these new descriptors for the screening stage of substructure searching on a relational database of about 1 million compounds using a diverse set of reference queries. The new keys can eliminate the vast majority (>99.9% on average) of nonmatching molecules within a fraction of a second. More importantly, for many of the queries the screening efficiency is 100%. A common feature was identified for the molecules for which power keys have perfect discriminative ability. This feature can be exploited to obviate the need for expensive atom-by-atom matching in situations where some ambiguity can be tolerated (fuzzy substructure searching). Other advantages over commonly used molecular keys are also discussed.
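The enumerate, encode, count, and screen steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the actual encoding, so the key used here (a sorted tuple of atom symbols plus a sorted tuple of bond labels) is an assumed, simplified invariant rather than a true canonical form, and the function names `enumerate_subgraphs`, `power_keys`, and `passes_screen` are hypothetical. Molecules are assumed to be given as a list of heavy-atom symbols and a list of `(i, j, order)` bonds.

```python
from collections import Counter


def enumerate_subgraphs(bonds, max_bonds):
    """Return every connected set of bond indices containing 1..max_bonds bonds."""
    # Map each atom index to the indices of the bonds that touch it.
    atom_to_bonds = {}
    for idx, (i, j, _order) in enumerate(bonds):
        atom_to_bonds.setdefault(i, set()).add(idx)
        atom_to_bonds.setdefault(j, set()).add(idx)

    # Start from single bonds and grow each subgraph by one adjacent bond per round.
    frontier = {frozenset([idx]) for idx in range(len(bonds))}
    seen = set(frontier)
    for _ in range(max_bonds - 1):
        next_frontier = set()
        for sub in frontier:
            atoms_in_sub = {a for b in sub for a in bonds[b][:2]}
            for a in atoms_in_sub:
                for nb in atom_to_bonds[a] - sub:
                    grown = sub | {nb}
                    if grown not in seen:
                        seen.add(grown)
                        next_frontier.add(grown)
        frontier = next_frontier
    return seen


def power_keys(atoms, bonds, max_bonds=3):
    """Count subgraphs by an (atom component, bond component) key.

    Assumption: the key is a simple invariant (sorted labels), so distinct
    subgraphs may share a key. That weakens screening but never rejects a
    true match.
    """
    keys = Counter()
    for sub in enumerate_subgraphs(bonds, max_bonds):
        atom_ids = {a for b in sub for a in bonds[b][:2]}
        atom_part = tuple(sorted(atoms[a] for a in atom_ids))
        bond_part = tuple(sorted(
            (min(atoms[i], atoms[j]), order, max(atoms[i], atoms[j]))
            for i, j, order in (bonds[b] for b in sub)
        ))
        keys[(atom_part, bond_part)] += 1
    return keys


def passes_screen(query_keys, target_keys):
    """A target survives screening only if it has at least as many of each query key."""
    return all(target_keys[k] >= n for k, n in query_keys.items())


# Heavy-atom graphs: ethanol (C-C-O), a C-O query, and propane (C-C-C).
ethanol = (["C", "C", "O"], [(0, 1, 1), (1, 2, 1)])
c_o_query = (["C", "O"], [(0, 1, 1)])
propane = (["C", "C", "C"], [(0, 1, 1), (1, 2, 1)])

ethanol_keys = power_keys(*ethanol)
print(passes_screen(power_keys(*c_o_query), ethanol_keys))
print(passes_screen(power_keys(*propane), ethanol_keys))
```

Because any substructure match maps each connected subgraph of the query onto a distinct subgraph of the target with the same labels, comparing counts in this way can only discard molecules that cannot match. Molecules that pass would still need atom-by-atom verification unless, as the abstract notes, some ambiguity is acceptable.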

PMID: 21955134
DOI: 10.1021/ci200282z
[Indexed for MEDLINE]
