Format

Send to

Choose Destination
See comment in PubMed Commons below
J Chem Inf Model. 2011 Apr 25;51(4):788-806. doi: 10.1021/ci100297y. Epub 2011 Mar 29.

MultiMCS: a fast algorithm for the maximum common substructure problem on multiple molecules.

Author information

1
Strand Life Sciences, Fifth Floor, Kirloskar Business Park, Bellary Road, Hebbal, Bangalore 560024, India. ramesh@strandls.com

Abstract

Several efficient correspondence graph-based algorithms for determining the maximum common substructure (MCS) of a pair of molecules have been published in the literature. The extension of the problem to three or more molecules is however nontrivial; heuristics used to increase the efficiency in the two-molecule case are either inapplicable to the many-molecule case or do not provide significant speedups. Our specific algorithmic contribution is two-fold. First, we show how the correspondence graph approach for the two-molecule case can be generalized to obtain an algorithm that is guaranteed to find the optimum connected MCS of multiple molecules, and that runs fast on most families of molecules using a new divide-and-conquer strategy that has hitherto not been reported in this context. Second, we provide a characterization of those compound families for which the algorithm might run slowly, along with a heuristic for speeding up computations on these families. We also extend the above algorithm to a heuristic algorithm to find the disconnected MCS of multiple molecules and to an algorithm for clustering molecules into groups, with each group sharing a substantial MCS. Our methods are flexible in that they provide exquisite control on various matching criteria used to define a common substructure.

PMID:
21446748
DOI:
10.1021/ci100297y
[Indexed for MEDLINE]
PubMed Commons home

PubMed Commons

0 comments
How to join PubMed Commons

    Supplemental Content

    Full text links

    Icon for American Chemical Society
    Loading ...
    Support Center