J Chem Inf Comput Sci. 2003 May-Jun;43(3):743-52.

Text Influenced Molecular Indexing (TIMI): a literature database mining approach that handles text and chemistry.

Author information

Department of Molecular Systems, Merck Research Laboratories, 126 East Lincoln Avenue, RY50SW-100, Rahway, New Jersey 07065-0900, USA.


We present an application of a novel methodology called Text Influenced Molecular Indexing (TIMI) to mine the information in the scientific literature. TIMI is an extension of two existing methodologies: (1) Latent Semantic Structure Indexing (LaSSI), a method for calculating chemical similarity using two-dimensional topological descriptors, and (2) Latent Semantic Indexing (LSI), a method for generating correlations between textual terms. The singular value decomposition (SVD) of a feature/object matrix is the fundamental mathematical operation underlying LSI, LaSSI, and TIMI and is used in the identification of associations between textual and chemical descriptors. Our studies with a database of 11,571 PubMed/MEDLINE abstracts show the advantages of merging textual and chemical descriptors over using either text or chemistry alone. Our work demonstrates that searching text-only databases limits retrieved documents to those that explicitly mention compounds by name in the text. Similarly, searching chemistry-only databases can retrieve only those documents that contain chemical structures. TIMI, however, enables search and retrieval of documents with textual queries, chemical queries, or queries that combine text and chemistry. Thus, the TIMI system offers a powerful new approach to uncovering the contextual scientific knowledge sought by the medical research community.
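
As an illustration of the SVD step described above, the following is a minimal sketch of LSI-style retrieval over a merged text/chemistry feature matrix. It is not the TIMI implementation: the feature names, the toy matrix, the rank k, and the helper functions (build_index, project_query, cosine_scores) are all assumptions made up for this example, and the abstract does not specify weighting, descriptor types, or the rank actually used.

```python
import numpy as np

# Hypothetical feature/object matrix: rows are features (text terms and
# chemical descriptors side by side), columns are documents. Values are invented.
FEATURES = ["inhibitor", "kinase", "statin", "frag:benzene", "frag:lactone"]
X = np.array([
    [1, 1, 0, 0],   # "inhibitor"    appears in documents 0 and 1
    [1, 1, 0, 0],   # "kinase"       appears in documents 0 and 1
    [0, 0, 1, 1],   # "statin"       appears in documents 2 and 3
    [1, 0, 0, 0],   # frag:benzene   only document 0 carries this descriptor
    [0, 0, 1, 0],   # frag:lactone   only document 2 carries this descriptor
], dtype=float)


def build_index(matrix, k):
    """Rank-k SVD of the feature/object matrix.

    Returns the feature basis U_k and the document coordinates V_k * S_k.
    """
    U, s, Vt = np.linalg.svd(matrix, full_matrices=False)
    U_k = U[:, :k]
    docs_k = Vt[:k, :].T * s[:k]   # one row of latent coordinates per document
    return U_k, docs_k


def project_query(query_vector, U_k):
    """Map a feature-space query into the same latent space as the documents."""
    return query_vector @ U_k


def cosine_scores(q_k, docs_k):
    """Cosine similarity between the projected query and every document."""
    num = docs_k @ q_k
    den = np.linalg.norm(docs_k, axis=1) * np.linalg.norm(q_k)
    return num / np.maximum(den, 1e-12)


def make_query(names):
    """Build a query vector with 1.0 for each named feature."""
    q = np.zeros(len(FEATURES))
    for name in names:
        q[FEATURES.index(name)] = 1.0
    return q


if __name__ == "__main__":
    U_k, docs_k = build_index(X, k=2)
    # A chemistry-only query: a single structural descriptor, no text terms.
    scores = cosine_scores(project_query(make_query(["frag:lactone"]), U_k), docs_k)
    for doc_id in np.argsort(-scores):
        print(doc_id, round(float(scores[doc_id]), 3))
```

In this toy setting, a chemistry-only query on the lactone descriptor also scores document 3 highly, although that document carries no chemical descriptor at all, because the term "statin" co-occurs with the descriptor in document 2. That is the kind of text/chemistry association the abstract attributes to the shared latent space.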

[Indexed for MEDLINE]
