Madame Curie Bioscience Database
Landes Bioscience

Analysis of Chemical Space

Gisbert Schneider

“The critical and age-old question remains: how should a chemist decide what to synthesize?” (W.P. Walters, M.T. Stahl, M.A. Murcko)1

A Virtual Screening Philosophy

A main goal of virtual screening (VS) is to select activity-enriched sets of molecules, or single molecules exhibiting desired activity, from the space of all synthetically accessible structures. Currently the most advanced high-throughput screening (HTS) techniques allow for testing of ~10^5 compounds per day, and a typical corporate screening library contains several hundred thousand samples. Although these facts alone represent a technological revolution, the turnover numbers are still extremely small compared with the total size of chemical space.1 As a consequence, even ultra-HTS combined with fast, parallel combinatorial chemistry can only be successful if a reasonable pre-selection of molecules (or molecular building blocks) for screening is done. Otherwise this approach will more or less represent a random search with a very small probability of success. While HTS and ultra-HTS have made significant progress in recent years, we should bear in mind that it will be very costly to screen a million compounds for activity in all the new receptor assays (estimated $0.1 to $10 per compound per screen). Even if a company has these resources, it rarely has access to a diverse one-million-compound screening library. Thus it can be advantageous to integrate VS tools into the drug discovery process to find leads with novel scaffolds, starting from competitor compounds described in the literature and/or from a proprietary, existing scaffold. Once a reliable VS process has been defined, it can save resources and limit experimental efforts by suggesting defined sets of molecules.

To reliably calculate prediction values or properties, the molecules under investigation must be represented in a suitable fashion. In other words, the appropriate level of abstraction must be defined to perform rational VS. A convenient way to do this is to employ molecular descriptors, which can be used to generate molecular encoding schemes reaching from general properties (e.g., lipophilicity, molecular weight, total charge, volume in solution, etc.) to very specific structural and pharmacophoric attributes (e.g., multi-point pharmacophores, field-based descriptors).2 Filtering tools can be constructed using a simplistic model relating the descriptors to some kind of bioactivity or molecular property. However, the selection of appropriate descriptors for a given task is not trivial and careful statistical analysis is required.

Figure 1. Virtual screening scheme (adapted from Walters et al.1).

Usually, at the beginning of a medicinal chemistry project one wants to perform a rather coarse-grained sieving of compounds displaying interesting bioactivity. Several such filtering rules have been compiled based on the analysis of known drugs and bioactive molecules.1,3–8 A more detailed description of the “drug-likeness” concept will be given in Chapter 4. As the knowledge about the required molecular features grows during a project, the VS technique becomes increasingly fine-grained. An overview of a typical VS process is shown in Figure 1.

Figure 2. Graphical representations of four different types of trees. Human perception easily classifies these patterns as “tree”, a difficult task for technical information processing systems.

Besides an appropriate representation of the molecules under investigation, any useful feature extraction system must be structured in such a way that meaningful analysis and pattern recognition are possible. Technical systems for information processing are intuitively considered as mimicking some aspects of human capabilities in the fields of perception and cognition. Despite great achievements in artificial intelligence research during the past decades, we are still far from understanding complex biological information processing systems in detail. This means that a feature extraction task that appears very simple to a human expert can be extremely hard or even impossible to solve for a technical system, e.g., a particular virtual screening software. To demonstrate the often fuzzy nature of features, let us consider the depictions shown in Figure 2. Human perception readily classifies these patterns as “tree”. We “intuitively” know the generalizing features defining the pattern class “tree”. This is, however, a difficult task for technical information processing systems because the common features of the four tree patterns are not easily rationalized and described. Useful attributes might include something like “the number of coherent areas” and “ball and stick”. Because it is very often impossible to describe the relevant features in sufficient detail, pattern recognition systems are often confronted with vaguely and incompletely described tasks.9,10 This is true for the tree-classification example, and also for molecular pattern recognition and virtual screening tasks. Some very basic feature extraction methods will be presented and discussed in this Chapter; more advanced systems that are of particular interest for adaptive SAR modeling are described in Chapter 3.

Logical Inference

A chemist's decision as to which molecules to synthesize next is usually based on the available facts about a particular project, expert knowledge that was acquired over years, and to some extent on intuition. Software containing knowledge about a particular, limited, real-world problem can assist in this decision-making process (Definition 2.1).11,12 It is important to note that virtual screening systems are thought to complement the abilities of a human expert, e.g., by analyzing very large sets of data and prioritizing many different designs. Some aspects of human decision-making and reasoning can be adapted or mimicked by “intelligent” software, but many features will probably remain a domain of human perception, cognition, and intuition.

Definition 2.1

Expert systems are computer programs that help to solve problems at an expert level.12

Figure 4. Some important scaffold structures that are amenable to solid phase combinatorial synthesis.

Based on an appropriate representation of knowledge, logical inferences are made by man and machine through deduction, abduction, and induction mechanisms (see Fig. 4). The modus ponens probably is the best known inference rule:

Given facts (axioms):
    IF i = A THEN j = B
    i = A
Inference (modus ponens):
    j = B

Additional important inference rules are the modus tollens, and the chain rule which combines several implications:

Given facts (axioms):
    IF i = A THEN j = B
    IF j = B THEN k = C
Inference (chain rule):
    IF i = A THEN k = C
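The two rules above can be sketched in a few lines of Python. This is only a minimal illustration, not a logic programming system: facts are written as hypothetical (variable, value) pairs, rules as (premise, conclusion) pairs, and modus ponens is applied repeatedly until nothing new follows. Chaining two implications then falls out of the repeated application.

```python
def forward_chain(facts, rules):
    """Apply modus ponens repeatedly until no new facts can be derived.

    facts: set of (variable, value) pairs taken as true axioms.
    rules: list of (premise, conclusion) pairs, read as
           IF premise THEN conclusion.
    """
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for premise, conclusion in rules:
            # Modus ponens: the premise holds, so the conclusion holds.
            if premise in derived and conclusion not in derived:
                derived.add(conclusion)
                changed = True
    return derived


rules = [
    (("i", "A"), ("j", "B")),  # IF i = A THEN j = B
    (("j", "B"), ("k", "C")),  # IF j = B THEN k = C
]
facts = {("i", "A")}

print(sorted(forward_chain(facts, rules)))
# [('i', 'A'), ('j', 'B'), ('k', 'C')]
```

Starting from the single fact i = A, the first pass derives j = B and the second pass derives k = C, which is the conclusion the chain rule states directly.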

Deductive logic programming using predicate logic is based on such rules.13,14 They can be used to derive hypotheses from true facts, i.e., they only consider the syntactic structure of the expressions. Several applications of logical reasoning systems have been described in the context of drug design.15–17 Induction seems to be especially suited to learning from examples, and the chemical similarity approach can be regarded as founded on this concept, as illustrated by the following simplified case:

Given facts:
    The pyrimidine derivative A is active in assay X
    The pyrimidine derivative B is active in assay X
    ...
Inference (induction):
    All pyrimidines are active in assay X

It must be stressed that induction is useful to derive new hypotheses but is not a legal inference in a strict sense. This simply means that the conclusion “All pyrimidines are active in assay X” can be wrong. In contrast, deduction is denoted a legal inference because if only true axioms are given then the conclusions drawn by deduction are also true. For further details on logical reasoning and inferencing, see the literature.13,18–20 Inductive logic programming (ILP)18 represents a relatively new addition to the field of logic programming, which seems to be appropriate for SAR and SPC (structure-property correlation) modeling tasks.15 According to Plotkin,21 an inductive learning task can be described using a background theory (facts and rules), sets of positive and negative examples (e.g., active and inactive molecules), a candidate hypothesis and a partial ordering system for alternative hypotheses, where the following conditions apply:

  1. The background knowledge should not be sufficient to explain all positive examples; otherwise the problem would already be solved (prior necessity).

  2. The background knowledge should be consistent with all negative and positive examples (prior satisfiability).

  3. The background knowledge and the hypothesis should together explain all positive examples (posterior sufficiency) and should not contradict any of the negative examples (strong posterior consistency); in the presence of noise, logical consistency is sufficient (weak posterior consistency).

  4. If there are several hypotheses which fulfill conditions 1, 2, and 3, then the most general hypothesis should be selected as the result.
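The first three conditions can be made concrete with a strongly simplified sketch. It assumes a propositional stand-in for ILP: background rules and the candidate hypothesis are plain Python predicates, and “explains” is read as “at least one predicate fires for the example”. The molecule records, feature names and rules are hypothetical toy data, not taken from any published study.

```python
def covers(rules, example):
    """True if at least one rule (a predicate) fires for the example."""
    return any(rule(example) for rule in rules)


def check_hypothesis(background, hypothesis, positives, negatives):
    """Evaluate conditions 1-3 for one candidate hypothesis."""
    theory = background + [hypothesis]
    return {
        # 1. Background alone must not already explain all positives.
        "prior_necessity": not all(covers(background, e) for e in positives),
        # 2. Background alone must not cover any negative example.
        "prior_satisfiability": not any(covers(background, e) for e in negatives),
        # 3a. Background plus hypothesis explain all positives.
        "posterior_sufficiency": all(covers(theory, e) for e in positives),
        # 3b. Background plus hypothesis cover no negative example.
        "strong_posterior_consistency": not any(covers(theory, e) for e in negatives),
    }


# Hypothetical toy data: active (positive) and inactive (negative) molecules.
positives = [
    {"scaffold": "pyrimidine", "mw": 300},
    {"scaffold": "pyrimidine", "mw": 350},
]
negatives = [{"scaffold": "indole", "mw": 320}]

background = [lambda m: m["scaffold"] == "pyrimidine" and m["mw"] > 340]
hypothesis = lambda m: m["scaffold"] == "pyrimidine"

for name, ok in check_hypothesis(background, hypothesis, positives, negatives).items():
    print(name, ok)
```

Condition 4 (preferring the most general hypothesis) would require an ordering over several candidates and is left out of this sketch.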

Figure 3. A Venn diagram representing the class relationship of the 20 genetically coded amino acids according to Taylor.25 This grouping of residues has been successfully applied to finding generalizing patterns in amino acid sequences.

In addition to drug design projects, ILP has found several successful applications in bioinformatical sequence analysis and protein structure prediction.22,23 The software PROMIS represents an early machine learning program written in the programming language PROLOG, which implements a generate-and-test hill-climbing beam search technique to find patterns in amino acid sequences.24 The idea is to find a sequence of classes (“rules”) that can be used to discriminate between different sets of amino acid sequences. The algorithm is a typical example of a population-based stochastic searching technique. It is related to a (μ+λ) evolution strategy (see Chapter 1): Starting from an initial set of general rules, new rules are formed by means of “generalization”, “specialization” and “extension”. Every newly formed rule is assessed by a fitness function (e.g., classification accuracy or coverage), and is either rejected or selected as a parent rule of the next optimization cycle. The beam searching idea—which can be regarded as analogous to using multiple parents in ES/GA techniques—is thought to reduce the risk of getting trapped in a local optimum. Compared to evolutionary algorithms, however, no adaptive strategy parameters were used in the original algorithm. Within the PROMIS software, Taylor's classification of amino acids is employed to describe similarity between residues and extract patterns that can be used for sequence classification (Fig. 3).25 Residue classes are easily represented by PROLOG expressions in the knowledge base, allowing for the construction of generalizing patterns formed by a sequence of class identifiers (ALL, SMALL, CHARGED, HYDROPHOBIC, AROMATIC, etc.):

class (ALL, [A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y]).
class (SMALL, [A,C,D,G,N,P,S,T,V]).
class (HYDROPHOBIC, [A,C,F,G,H,I,K,L,M,T,V,W,Y]).
class (AROMATIC, [F,H,W,Y])
...

One particular such rule is listed below. It was generated by a modified version of PROMIS and gives an idea of characteristic features found in peptide substrates of the mitochondrial processing peptidase (MPP):26–28

ALL VERY_HYDROPHOBIC OR SMALL OR LYSINE POSITIVE LARGE NEUTRAL NEUTRAL ALL

In a given matching sequence, this MPP cleavage site pattern starts with the class “ALL”. Moving towards the C-terminal end of the sequence, the next position must match one of the residues described by “VERY_HYDROPHOBIC OR SMALL OR LYSINE”, the second next residue must be positively charged, and so on. Such machine-generated rules can help to find and understand function-determining patterns in amino acid sequences. This general feature extraction approach complements other pattern matching routines used in sequence analysis.29,30 Its principle of using generic molecular descriptors (here: residue classes) is very similar to establishing an SAR model for drug design by adaptive rule formation.
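How such a class pattern is matched against a sequence can be sketched as follows. The four residue classes are copied from the listing above; the pattern and the test sequence are hypothetical and chosen only for illustration (the other classes used by the MPP rule are not defined in this text, so they are not reproduced here). This is not the PROMIS implementation, only the matching step.

```python
# Residue classes as given in the knowledge-base listing above.
CLASSES = {
    "ALL": set("ACDEFGHIKLMNPQRSTVWY"),
    "SMALL": set("ACDGNPSTV"),
    "HYDROPHOBIC": set("ACFGHIKLMTVWY"),
    "AROMATIC": set("FHWY"),
}


def matches_at(sequence, start, pattern):
    """Check whether the pattern matches the sequence at position start.

    pattern: one entry per sequence position; each entry is a list of
    class names that are combined by OR.
    """
    if start + len(pattern) > len(sequence):
        return False
    for offset, alternatives in enumerate(pattern):
        residue = sequence[start + offset]
        if not any(residue in CLASSES[name] for name in alternatives):
            return False
    return True


def find_matches(sequence, pattern):
    """Return all start positions (0-based) where the pattern matches."""
    last_start = len(sequence) - len(pattern)
    return [i for i in range(last_start + 1) if matches_at(sequence, i, pattern)]


# Hypothetical three-position pattern: ALL, (AROMATIC OR SMALL), HYDROPHOBIC
pattern = [["ALL"], ["AROMATIC", "SMALL"], ["HYDROPHOBIC"]]
print(find_matches("MKWLAE", pattern))
# [1]
```

In the example only the window K-W-L matches: K belongs to ALL, W is aromatic, and L is hydrophobic.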

This short introduction to some general inference mechanisms was to demonstrate one possible approach to how experimental observations and expert knowledge can be represented as facts and rules, and conclusions can be drawn that might help the medicinal chemist to generate new hypotheses (Fig. 4). Learning by induction provides a theoretical framework for many SAR/SPC modeling tasks. As we have learned from many years of “artificial intelligence” research it is extremely difficult (if not impossible) to develop virtual screening algorithms mimicking the medicinal chemists' intuition. Furthermore, there is no common “gut feeling” as different chemists have different educational background, skills and experience. Despite such limitations there is, however, substantial evidence that it is possible to support drug discovery in various ways with the help of computer-assisted library design and selection strategies. There are two specific properties of computers, which make them very attractive for virtual screening applications:

  1. With the help of virtual library construction, hitherto unknown parts of chemical space can easily be explored, and

  2. the speed and throughput of virtual testing (fitness or quality calculations) can be far ahead of what is possible by means of “wet bench” experimental systems.

Chemical Compound Libraries

Table 1. Some major databases that are useful for virtual screening experiments (adapted from Eglen et al.33)

Database | No. of molecules | Description
ACD (a) | > 250,000 | Available Chemicals Directory; catalogue of commercially available specialty and bulk chemicals from over 225 international suppliers
Beilstein (b) | > 7,000,000 | Covers organic chemistry from 1779
CSD (c) | > 200,000 | Cambridge Structural Database; experimentally determined three-dimensional structures of small molecules
CMC (a) | > 7,000 | Comprehensive Medicinal Chemistry database; structures and activities of drugs having generic names (on the market)
MDDR (a) | > 85,000 | MACCS-II (MDL) Drug Data Report; structures and activity data of compounds in the early stages of drug development
MedChem (d) | > 35,000 | Medicinal Chemistry database; pharmaceutical compounds
SPRESI (d) | > 3,400,000 | Substances and bibliographic data abstracted from the world's chemical literature
WDI (e) | > 50,000 | World Drug Index; pharmaceutical compounds from all stages of development

(a) Molecular Design Limited, San Leandro, CA, U.S.A.
(b) Beilstein Informationssysteme GmbH, Frankfurt, Germany.
(c) CSD Systems, Cambridge, U.K.
(d) Daylight Chemical Information Systems Inc., Claremont, CA, U.S.A.
(e) Derwent Information, London, U.K.

Two complementary compound sources are accessible for virtual screening: databases of known structures and de novo designs (including enumerated combinatorial libraries). Some major databases frequently employed for virtual screening experiments are listed in Table 1. In addition, several companies offer large libraries of both combinatorial and historical collections on a commercial basis. Usually the combinatorial collections contain 100k–500k structures, whereas commercially available historical collections rarely exceed 100k compounds. Most of the major pharmaceutical companies have compound collections in the 300k+ range. Combinatorial libraries usually provide small amounts of uncharacterized compounds for screening. Once these samples are fully characterized (e.g., by HPLC and mass spectrometry), the data are of interest for structure-activity purposes. In most companies, these compounds are held alongside the “historical” collection of compounds, generally derived from classical medicinal chemistry programs, most of which have very well-defined chemical characteristics. Commercial compound collections that fall between these two extremes can also be purchased. Collectively, therefore, the information used to relate biological activity and chemical structure must clearly integrate all of these types of compounds.

Assessment of the diversity of a compound library is often a first step in virtual screening. The most relevant approach is clearly to assess the diversity space using chemical criteria, and several algorithms are now available to do that. It is likely that after diversity analysis and extensive experimental screening of the library, at several targets and target classes, the structure-activity database will point to areas of success and failure in terms of identifying leads. Thus the library may be said to be “GPCR-modulator rich”, “kinase-inhibitor poor”, etc. An experiment-based understanding of the screening library diversity should also reveal compounds that are “frequent hitters”, i.e., compounds that are not necessarily chemically reactive, but have structures that repeatedly bind to a range of targets via unspecific interactions or cause a false-positive signal for other assay-inherent reasons. Clearly, removal of these compounds from the library is an advantage in HTS, as is an understanding of the reason for their promiscuity of interaction. A further issue relates to identifying a screening library subset, ostensibly representative of the diversity of the whole library, that is screened at all targets, usually as a priority in the screening campaign. Assessment of chemical versus operational understanding of diversity is critical in the design of the library subset. Moreover, there are advantages in screening the whole library. First, since HTS or uHTS is generally unconstrained by cost or compound usage, it is as easy to screen 250k compounds as it is to screen 25k. Second, screening the whole library increases the likelihood of finding actives, especially for difficult targets, as well as of finding multiple structurally distinct leads.

Indeed, a direct comparison reported from Pfizer noted that screening a representative library missed 32 of the 39 leads that were found by screening the whole library.31 Alternatively, Pharmacopeia have reported that receptor antagonists for the CXCR2 receptor and the human bradykinin B1 receptor were derived from the same 150k compound library, made using the same four combinatorial steps. Notably, this library was neither based on known leads in the GPCR field nor specifically targeted towards GPCRs. On the other hand, researchers at Organon reported that it is possible to rationally select various “actives” from large databases using appropriate “diversity” selection and “representativity” methods.32

The introduction of combinatorial chemistry, HTS and the presence of large compound collections have put us in the comfortable position that there is a large number of hits to choose from for lead optimization, at least for certain classes of drug targets. We anticipate that while the size of the compound libraries and the number of high-throughput screens will continue to increase, leading to a larger number of hits, the number of leads actually being followed up per project will remain roughly the same. The challenge is to select the most promising candidates for further exploration, and computational techniques will play a very important role in this process. Assuming a hit rate of 0.1–1% and a compound-collection size of 10^6 compounds, we have (or will have) about 1k–10k hits that are potential starting points for further work. It is important to realize that while the screening throughput has increased significantly, the throughput of a traditional chemistry lab has not. While it is true that automated and/or parallel chemistry is now routinely used, there are still many molecules that are not amenable to these more automated and high-throughput approaches. Therefore the question is: “How can subsequent lead optimization fully exploit this vast amount of information?” Computational techniques can be used to address the question in a variety of ways:33

Combinatorial library enumeration provides a straightforward way to prepare large virtual collections of molecules that are chemically feasible. The idea is to define sets of molecular building blocks and a list of chemical reactions for virtual synthesis. Both building blocks and reactions should be close to what is tractable in the laboratory to facilitate the synthesis of selected candidates. However, the real synthons need not necessarily be employed for virtual library construction. The stock of virtual building blocks can be compiled from commercially available structures (e.g., from the ACD), hypothetical structures, and from retrosynthetic fragmentation of already known molecules.33,34 Generally, the term “building block” refers to variant structural parts of a combinatorial library, where the different building blocks present in a structure are denoted by R1, R2, etc. The “scaffold” contains the invariant structural attributes of a combinatorial library, and a “linker” can be any scaffold or building block with two combinatorial attachment sites. In Fig. 4 some typical combinatorial library scaffolds are shown.
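The enumeration idea can be sketched without any chemistry toolkit. The example below is deliberately a toy: the scaffold is a text template with named attachment sites and the building blocks are plain labels, all of them hypothetical. A real implementation would operate on chemical structures and encoded reactions; only the combinatorics (one product per combination of building blocks) carries over.

```python
from itertools import product


def enumerate_library(scaffold, building_blocks):
    """Yield every product of a virtual combinatorial library.

    scaffold: template string with named attachment sites, e.g. '{R1}'.
    building_blocks: dict mapping each site name to a list of fragments.
    """
    sites = sorted(building_blocks)
    # One product per element of the Cartesian product of all site lists.
    for combination in product(*(building_blocks[site] for site in sites)):
        yield scaffold.format(**dict(zip(sites, combination)))


# Hypothetical scaffold with two attachment sites and labelled fragments.
scaffold = "scaffold(R1={R1}, R2={R2})"
building_blocks = {
    "R1": ["methyl", "ethyl", "phenyl"],
    "R2": ["H", "Cl"],
}

library = list(enumerate_library(scaffold, building_blocks))
print(len(library))  # 3 * 2 = 6 virtual products
print(library[0])    # scaffold(R1=methyl, R2=H)
```

The library size is the product of the number of building blocks per site, which is why even modest building-block stocks lead to very large virtual collections.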

Both natural polymers like peptides or nucleic acids and small organic molecules provide building blocks and scaffolds for virtual library construction. Limited diversity, a preference for flexible, linear structures and usually bad pharmacokinetic properties are problematic issues tied to natural polymer libraries. In contrast small molecule libraries can cover a large diversity space, often contain rigid molecules and “unnatural” structures, often have desired pharmacokinetic properties and can more easily be optimized in the lead optimization phase of drug discovery (Fig. 1.1).

Similarity Searching

Chemical similarity searching is a straightforward practical approach to identify candidate molecules by pair-wise comparison of compounds. In its simplest form, the result of a similarity search in a compound database is a ranked list, where high-ranking structures are considered to be more similar to the query in a certain sense than low-ranking molecules. If either the query structure(s) or the database structures or both structures reveal a certain (desired or undesired) property or activity, some conclusions may be drawn for the molecules under investigation. Structures are compared based on a similarity value that is calculated from their molecular descriptors. There are two assumptions inherent to this idea, representing the hypothesis “if molecule A is more similar to the query molecule R than molecule B, then molecule A might more likely show some biological activity that is comparable to the activity of R”:

  1. The molecular representation (descriptor) is assumed to appropriately cover those molecular attributes which are relevant for the underlying SAR/SPR.

  2. The similarity measure applied is assumed to accurately relate differences in molecular descriptions to differences in the quality function (Principle of Strong Causality).35

In the past, the analysis of assay data was primarily performed by medicinal chemists, who looked at the active compounds and then decided which hits the efforts should be focused on. First, with the increase in the number of experimentally determined hits, this approach becomes increasingly ineffective, and computational techniques are increasingly used to classify the hits and derive hypotheses. Second, one should keep in mind that it is basically impossible for a human being to also take into account the large number of inactive compounds. The development of pharmacophore hypotheses, for example, typically requires the incorporation of information on inactive compounds.

By similarity searching, sets of candidate structures can be rapidly compiled from databases or virtual chemical libraries. Practical experience shows that such hypotheses are often weak and there clearly is no cure-all recipe or generally valid hypothesis leading to success in chemical similarity searching. Nevertheless, similarity searching provides a useful concept. A practicable measure of success can be expressed by an enrichment factor, ef, giving the ratio of the fraction of active molecules in the selected subset compared to the fraction of actives in the total pool (database). This value may be regarded as an estimate of the enrichment obtained compared to a random selection of molecules, as given by Equation 2.1.

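\[ ef = \frac{S_{act} / S_{all}}{P_{act} / P_{all}} \qquad (2.1) \]

where S_act is the number of active molecules in the selected subset, S_all the total number of molecules in the selected subset, P_act the number of active molecules in the total pool, and P_all the total number of molecules in the pool.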
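As a worked example of the enrichment factor, the following sketch computes the ratio described above from four counts. The function and argument names are assumptions made for this illustration, and the numbers are invented.

```python
def enrichment_factor(selected_actives, selected_total, pool_actives, pool_total):
    """Ratio of the active fraction in the selection to that in the pool."""
    if selected_total <= 0 or pool_total <= 0 or pool_actives <= 0:
        raise ValueError("subset, pool and number of pool actives must be positive")
    # (selected_actives / selected_total) / (pool_actives / pool_total),
    # rearranged so that only one division is needed.
    return (selected_actives * pool_total) / (selected_total * pool_actives)


# Invented numbers: 200 actives among 10,000 database compounds (2%),
# and 10 actives among the 100 top-ranked compounds (10%).
print(enrichment_factor(10, 100, 200, 10_000))
# 5.0
```

A value of 1 corresponds to what a random selection would be expected to achieve; values above 1 indicate that the selected subset is enriched in actives.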

Figure 5. Distributions of amino acids in chemical space resulting from the selection of different molecular descriptors. Amino acids are denoted in single letter code. a) Volume109 and hydrophobicity;110 b) bulkiness111 and refractivity.111

A large number of molecular descriptors has been developed over the past decades (Definition 2.2).2 The particular selection of a molecular representation defines a chemical space, and thus the ordering of molecules within this space. The choice of descriptors influences the distribution of structures. In Fig. 5 two distributions of the 20 genetically encoded amino acids are shown as an example.

Definition 2.2

“The molecular descriptor is the final result of a logical and mathematical procedure which transforms chemical information encoded within a symbolic representation of a molecule into a useful number or the result of some standardized experiment.” (according to Todeschini and Consonni)2

Data scaling is usually the first step of chemical similarity searching, feature extraction, hypothesis generation, and other types of virtual screening and machine learning. The most frequently applied scaling methods include scaling by range (Eq. 2) and scaling by standard deviation (autoscaling, Eq. 3). For most applications autoscaling is a method of choice, leading to data with zero mean and unit variance. In some cases, vector normalization to length one is a necessary preprocessing procedure (Eq. 4).

\[ x'_{ik} = \frac{x_{ik} - \min_i(x_{ik})}{\max_i(x_{ik}) - \min_i(x_{ik})} \qquad (2) \]

where i is the row index, and k is the column index of the raw data matrix X.

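\[ x'_{ik} = \frac{x_{ik} - \bar{x}_k}{s_k}, \qquad \bar{x}_k = \frac{1}{n} \sum_{i=1}^{n} x_{ik}, \qquad s_k = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} \left( x_{ik} - \bar{x}_k \right)^2} \qquad (3) \]

\[ x'_{ik} = \frac{x_{ik}}{\sqrt{\sum_{i=1}^{n} x_{ik}^2}} \qquad (4) \]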

In Equations 3 and 4 n is the number of objects (molecules). Autoscaling results in data vectors scaled to a length of

\[ \sqrt{n-1}. \]
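The three preprocessing steps can be sketched in plain Python for a single descriptor column. Two assumptions are made here that the text does not spell out: the standard deviation uses the n − 1 denominator (which is what gives the length stated above), and each function receives one column of the data matrix as a list of numbers. Constant columns, which would cause a division by zero, are not handled.

```python
import math


def range_scale(column):
    """Scale values linearly to the interval [0, 1]."""
    lo, hi = min(column), max(column)
    return [(x - lo) / (hi - lo) for x in column]


def autoscale(column):
    """Scale to zero mean and unit variance (n - 1 in the denominator)."""
    n = len(column)
    mean = sum(column) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in column) / (n - 1))
    return [(x - mean) / std for x in column]


def unit_length(vector):
    """Normalize a vector to Euclidean length one."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def length(vector):
    return math.sqrt(sum(x * x for x in vector))


column = [1.0, 2.0, 3.0, 4.0]  # invented descriptor values for n = 4 molecules
print(range_scale(column))
print(length(autoscale(column)), math.sqrt(len(column) - 1))
print(length(unit_length(column)))
```

For the four invented values, the autoscaled column has a length equal to the square root of 3, in agreement with the statement above.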

Various similarity measures exist that can be used for chemical similarity searching. Very often a distance value d_AB between pairs of molecules A and B (i.e., their descriptors ξ_A and ξ_B containing n elements each) forms the basis on which a similarity value is calculated. The frequently used Manhattan distance (also called Hamming distance or City-Block distance; Eq. 5) and the Euclidean distance (Eq. 6) are the first two examples of a general distance metric, the Minkowski or Lp-metric (Eq. 7; see Eq. 1.9).

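\[ d_{AB} = \sum_{i=1}^{n} \left| \xi_{Ai} - \xi_{Bi} \right| \qquad (5) \]

\[ d_{AB} = \sqrt{\sum_{i=1}^{n} \left( \xi_{Ai} - \xi_{Bi} \right)^2} \qquad (6) \]

\[ d_{AB} = \left( \sum_{i=1}^{n} \left| \xi_{Ai} - \xi_{Bi} \right|^p \right)^{1/p} \qquad (7) \]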

The similarity measure based on the Minkowski distance can be used to express molecular similarity (Eq. 8). Completely similar or identical structures have a similarity value of s_AB = 1, completely dissimilar molecules have s_AB = 0.

\[ s_{AB} = 1 - \frac{d_{AB}}{d_{AB}^{(max)}} \qquad (8) \]
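A small sketch may help to see how the Minkowski distance and the derived similarity value are used to rank a database against a query. The descriptor vectors and compound names below are invented, and the maximal distance is taken over the query-to-database distances, which is one of the options mentioned in the text.

```python
def minkowski(a, b, p):
    """Minkowski (Lp) distance; p = 1 is Manhattan, p = 2 is Euclidean."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def similarity(distance, max_distance):
    """Map a distance to [0, 1]: 1 for identical, 0 for the most distant."""
    return 1.0 - distance / max_distance


print(minkowski([0, 0], [3, 4], p=1))  # 7.0
print(minkowski([0, 0], [3, 4], p=2))  # 5.0

# Invented two-dimensional descriptors for a query and three compounds.
query = [1.0, 2.0]
database = {"cpd_1": [1.0, 2.5], "cpd_2": [4.0, 6.0], "cpd_3": [1.0, 2.0]}

distances = {name: minkowski(query, vec, p=2) for name, vec in database.items()}
d_max = max(distances.values())
ranked = sorted(database, key=lambda name: similarity(distances[name], d_max), reverse=True)
print(ranked)  # ['cpd_3', 'cpd_1', 'cpd_2']
```

Note that the similarity values depend on the data set through the maximal distance, so they are comparable within one search but not across different databases.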

where d_AB(max) represents the maximal pair-wise distance found in the data set under investigation, e.g., the maximal distance between the query structure and a database compound. Many additional distance and similarity measures have found application in chemical similarity searching.36,37 The Tanimoto coefficient is probably the best-known similarity index that is applied to the comparison of bitstring representations of molecules (although its application is not restricted to dichotomous variables). The set-theoretic definition of the Tanimoto coefficient is given by Equation 9, where χ_A denotes the bits set to 1 in the bitstring vector coding for molecule A, χ_B denotes the bits set to 1 in the bitstring vector coding for molecule B, and the vertical bars give their number. The range of values of the Tanimoto similarity measure is [0,1] for dichotomous variables.

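\[ s_{AB} = \frac{|\chi_A \cap \chi_B|}{|\chi_A \cup \chi_B|} = \frac{|\chi_A \cap \chi_B|}{|\chi_A| + |\chi_B| - |\chi_A \cap \chi_B|} \qquad (9) \]

Here |χ_A ∩ χ_B| is the number of bits set to 1 in both bitstrings, and |χ_A ∪ χ_B| is the number of bits set to 1 in at least one of them.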
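For bitstrings the Tanimoto coefficient reduces to counting shared and total set bits. The sketch below represents each fingerprint as the set of positions whose bit is 1; the two fingerprints are invented, and returning 0.0 when both fingerprints are empty is a convention chosen for this example, not something the text prescribes.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient of two fingerprints given as sets of 'on' bit positions."""
    a, b = set(bits_a), set(bits_b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(union)


# Invented fingerprints: positions of the bits set to 1.
fp_a = {1, 2, 3, 5}
fp_b = {2, 3, 4}

print(tanimoto(fp_a, fp_b))  # 2 shared bits / 5 bits in total = 0.4
print(tanimoto(fp_a, fp_a))  # 1.0
```

Bits that are 0 in both fingerprints do not enter the calculation, which distinguishes the Tanimoto coefficient from measures based on the Hamming distance.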

Figure 6. Structures retrieved by similarity searching taking Midazolam (left) as the query structure. Top line: Tanimoto/Daylight method; Bottom line: CATS method.

Figure 7. Coding a chemical structure by the CATS topological atom type descriptor. A 2D molecular structure (a) is converted to the molecular graph (b), generalizing atom types are assigned (c), and the frequency of every atom pair with a distance between 1 and 10 bonds is determined (d). For five atom types (lipophilic, L; hydrogen bond donor, D; hydrogen bond acceptor, A; positively charged, P; negatively charged, N) there are 15 possible pairs, resulting in a 15 × 10 = 150-dimensional histogram representing a molecular structure. In (d) an L-A pair over nine bonds is shown.

Different ranked lists are obtained from different similarity searching methods. The different results originate from different molecular descriptors and different similarity measures. The particular choice of a distance or similarity criterion and the selection of molecular descriptors are subject to a learning process in each medicinal chemistry project. Similarity-based VS of large databases and virtual libraries needs representations of the molecules that are both effective and efficient, i.e., they must be able to differentiate between molecules that are different, and they must be quick to calculate. An example of different similarity searching results obtained with the same query structure (midazolam) is given in Fig. 6. In this example the molecules were coded by two different descriptors and also compared by two different similarity measures. In Fig. 6a the common Daylight chemical fingerprints served as a molecular representation,38 and the Tanimoto coefficient (Eq.9) was used for similarity searching. In Fig. 6b a simple topological pharmacophore descriptor was used together with the Euclidean distance measure (Eq.6).39 The idea of the particular topological pharmacophore representation is illustrated in Fig. 7.
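The counting step of such a topological atom-pair descriptor can be sketched as follows. This is not the CATS software: the assignment of generalized atom types is assumed to be given, the ordering of the 15 type pairs in the output vector is an arbitrary choice, and no scaling of the counts is applied. The molecular graph is a hypothetical four-atom chain.

```python
from collections import deque
from itertools import combinations_with_replacement

TYPES = "LDAPN"  # lipophilic, donor, acceptor, positive, negative
PAIRS = list(combinations_with_replacement(TYPES, 2))  # 15 unordered type pairs
MAX_DIST = 10  # topological distances of 1 to 10 bonds are counted


def bond_distances(graph, start):
    """Shortest path lengths (in bonds) from start to all reachable atoms."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        atom = queue.popleft()
        for neighbor in graph[atom]:
            if neighbor not in dist:
                dist[neighbor] = dist[atom] + 1
                queue.append(neighbor)
    return dist


def atom_pair_histogram(graph, atom_types):
    """Count typed atom pairs per bond distance; returns 15 * 10 = 150 values."""
    pair_index = {pair: i for i, pair in enumerate(PAIRS)}
    vector = [0] * (len(PAIRS) * MAX_DIST)
    atoms = sorted(graph)
    for pos, a in enumerate(atoms):
        dist = bond_distances(graph, a)
        for b in atoms[pos + 1:]:  # visit each unordered atom pair once
            d = dist.get(b)
            if d is None or not 1 <= d <= MAX_DIST:
                continue
            for type_a in atom_types.get(a, ""):
                for type_b in atom_types.get(b, ""):
                    pair = tuple(sorted((type_a, type_b), key=TYPES.index))
                    vector[pair_index[pair] * MAX_DIST + (d - 1)] += 1
    return vector


# Hypothetical chain 0-1-2-3 with a lipophilic atom and an acceptor at the ends.
graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
atom_types = {0: "L", 3: "A"}

histogram = atom_pair_histogram(graph, atom_types)
print(len(histogram), sum(histogram))                  # 150 1
print(histogram[PAIRS.index(("L", "A")) * MAX_DIST + 2])  # L-A pair over 3 bonds: 1
```

Because only shortest-path bond counts are needed, such a vector can be computed quickly and compared directly with a distance measure, without generating conformers.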

Figure 8. Schematic of the NAPAP-thrombin complex. On the left the most important interactions between the thrombin inhibitor NAPAP and the thrombin active site are shown. A simple pharmacophore model of thrombin activity is given on the right. L: lipophilic, D: hydrogen-bond donor, A: hydrogen-bond acceptor, P: positively charged or ionizable group. Pharmacophore models can be used for similarity searching and de novo design exercises.

Pharmacophore models are widely applied molecular representations that are particularly useful for drug design (Definition 2.3).40 The idea is to consider a set of generalized atom types (e.g., H-bond donors and acceptors, lipophilic and charged groups) and their constellation in space, i.e., distances between atom type centers, as a “fingerprint” of a molecule. It is hoped that this abstraction from chemical structure (“meta-description”) represents function-determining molecular features, and facilitates grouping of isofunctional compounds (Fig. 8). An in-depth treatment of 2D and 3D pharmacophore modeling is beyond the scope of this book. Much research has been done in this very important and active area of virtual screening. The interested reader is referred to the literature.40,41 The usefulness of topological pharmacophores for similarity searching is demonstrated with a worked example below. Their particular advantage is that they can be quickly calculated and no 3D alignment of conformers is required.

Definition 2.3

A pharmacophore or pharmacophoric pattern is the ensemble of steric and electronic features that is necessary to ensure the optimal supramolecular interactions with a specific biological target structure and to trigger (or to block) its biological response. (IUPAC recommendation 1997)42

Figure 9.

Structure of mibefradil (1), a calcium channel blocking agent, and four selected isofunctional hits (2–5), which were retrieved by virtual database screening using the CATS software. Taking structure 5 as a query for CATS, a closely related structure (6) to mibefradil is retrieved. RTTC: recombinant T-type calcium channel, FLIPR: fluorometric imaging plate reader.

The following example demonstrates straightforward database similarity searching using CATS. Ion channels are essential to a wide range of physiological functions such as neuronal signaling, muscle contraction, cardiac pacemaking, hormone secretion and cell proliferation.43 There is evidence that brain T-type calcium channels can modulate excitability and give rise to burst-firing in some CNS neurons. Recently, certain anti-epileptic drugs with anxiolytic properties have been reported to display significant T-channel blocking activity—thus indicating scopes of selective T-type channel blockers as novel neuropsychiatric therapeutics.44,45 Mibefradil has been described as the first T-type selective calcium channel blocking agent showing about 20-fold selectivity over L-type channels (structure 1 in Fig. 9).46,47 The application of this compound was however hampered by some drug-drug interaction liabilities. As a consequence, several lead finding initiatives were undertaken based on pharmacophoric features of mibefradil—aiming at a translation into novel scaffolds of lead structures with improved molecular properties and suited for high-speed chemical optimization. These new structures should then serve as starting points to develop novel brain- and heart-selective T-type calcium channel antagonists. The first goal of the lead identification approaches was thus to identify novel lead structures with affinity and selectivity for T-type calcium channels comparable to mibefradil—but with reduced structural complexity, low cytochrome-P450 interaction liability, and with molecular properties indicating scope for chemical optimization. Mibefradil served as the query structure to virtually screen the Roche corporate compound database using CATS. The twelve highest-ranking compounds, which passed certain molecular property filters, were selected and transmitted to the biological screening. 
Nine out of these twelve molecules (75%) showed significant T-type calcium channel antagonistic activity, with IC50 values comparable to the value of mibefradil. Some selected compounds are depicted in Fig. 9. The molecular architectures of the CATS hits are strikingly different from the mibefradil template—however certain common structural features are preserved, particularly in structures 2 and 3 given in Fig. 9. This common theme includes a central spacer or chain with an amino group—providing a positive charge at physiological pH—framed by two rather extended substructures, each containing an aromatic group. The topological length of the linker matches well with the one present in mibefradil. The CATS hit 4, which has the lowest similarity value of the hits shown, is most distinguished from mibefradil and has a carboxamido functional group in the central chain instead of amino functionality, which constitutes an (inducible) positive partial charge at this site. Compared to mibefradil, all three CATS hits are structurally less complex, with lower molecular weights (MW < 400) and lower lipophilicity values (clogP values: 4.6, 4.3 and 4.0 for compounds 2, 3 and 4; clogP = 6.1 for mibefradil 1). With respect to potential drug-drug interaction liabilities, the in silico similarity to a cytochrome-P450 binding model (pharmacophore model) for 2, 3 and 4 are much lower than for 1 (F. Hoffmann-La Roche Ltd.; the prediction model was developed by H. Fischer, S. Poli, and M. Kansy; unpublished). The compounds offer broad scope for chemical optimization and were further characterized in-depth (F. Hoffmann-La Roche Ltd.; W. Neidhart, T. Giller, G. Schmid, G. Adam; unpublished). In the frame of selection and characterization of analogues of hit 3, the cyclodecyl derivative 5 shown in Fig. 9 was found to exhibit the highest T-type calcium channel inhibitory activity of all structures of this type. 
Subsequent CATS similarity searching based on 5 as the template (query) structure gave rise to the tetrahydro-naphthalene structure 6 as a further hit. The compound is again a close analogue of mibefradil 1 and was prepared in a campaign to obtain improved analogues. This finding emphasizes the potential of iterative similarity searching to transgress the borders of chemical scaffolds—in identifying and translating molecular determinants—to obtain novel biologically active molecules.

Feature Extraction Methods

A common theme in molecular feature extraction is the transformation of raw data to a new co-ordinate system, where the axes of the new space represent “factors” or “latent variables”—features that might help to explain the shape of the original distribution. By far the most widely applied statistical feature extraction method in drug design belongs to the class of factorial methods: principal component analysis (PCA).12,48,49 PCA performs a linear projection of data points from the high-dimensional space to a low-dimensional space. In addition to PCA, non-linear projection methods like self-organizing maps (SOM), encoder networks, and Sammon mapping are sometimes employed in drug design projects.50 Since none of these methods require the knowledge of target values (e.g., inhibition constants, properties) or class membership (e.g., active/inactive assignments) they are termed “unsupervised”. Unsupervised procedures can be used to perform a first data analysis step, complemented by supervised methods later during the adaptive molecular design process.

Principal component analysis. PCA projects the data matrix X, which holds n objects described by m variables, onto a d-dimensional subspace by means of the projection matrix LT, yielding the object co-ordinates in this subspace, S (Eq.10; Eq.11). S is termed the score matrix with n rows (objects, molecules) and d columns (principal components). L is termed the loading matrix with d columns and m rows, and T denotes the matrix transpose.

$$X = S\,L^{T} \qquad (10)$$

$$S = X\,L \qquad (11)$$

Figure 10.

Principal components (PC) of a set of two-dimensional data. The original co-ordinate system is spanned by x1 and x2. The orthogonal score vectors s1 and s2 are calculated according to the criterion of maximum variance.

Figure 11.

Projections of the distribution of 73 molecules generated by the four-component UGI reaction (top). Structures were encoded by the 150-dimensional CATS topological atom pair descriptor; the plots represent projections to a two-dimensional space. Shading indicates activity in a thrombin binding assay (black: IC50 < 1 μM; white: IC50 > 10 μM). a) PCA projection, b) Sammon map, c) encoder network projection (see Fig. 12b), d) toroidal SOM containing (7 × 7) neurons. The molecules were synthesized and assayed at F. Hoffmann-La Roche Ltd.112

The principal components (PC) are determined on the basis of the maximal variance criterion, i.e., the first PC represents a regression line along the direction of maximal data variance, and the second PC is orthogonal to the first PC along the criterion of maximum variance, and so on (Fig. 10). According to this most of the total data variance is contained (“explained”) by the first PCs. The loading matrix contains the regression coefficients of each PC with the original axes (here: molecular descriptors), and the new co-ordinates (PC) are linear combinations of the original variables. Loadings plots give the correlation of the original variables with selected principal components. Graphical inspection or interpretation of the—usually varimax-rotated—loadings matrix can be a helpful step towards an understanding of essential function-determining molecular features or pharmacophores (Definition 2.3). If all PCs are used in the model, 100% of the original data variance is explained. The sum of the squared loadings is also called the eigenvalue of a principal component. For data reduction or visualization, usually only the first two or three PCs are used, i.e., the PCs with the greatest eigenvalues. Further details about PCA and related approaches can be found in the literature.12,51 Fig. 11a gives the projection of 73 UGI reaction products in the plane spanned by the first two principal components derived from PCA of a 150-dimensional topological pharmacophore space (CATS atom pair descriptor). The UGI reaction is shown in Fig. 11. This score plot is a linear projection of the original high-dimensional space. There is a clear separation of active molecules (IC50 < 1 μM) and inactive molecules (IC50 > 10 μM) as judged from a thrombin inhibition assay. Based on this projection one could argue that the molecular descriptor seems to be useful to find thrombin inhibitors.

A simple but useful algorithm for PCA is given by the NIPALS (nonlinear iterative partial least squares) technique. The following description of the algorithm is taken from Otto.12

NIPALS Algorithm for Principal Component Analysis

Step 0

Scale the raw data matrix by the mean and normalize to length one

Step 1

Estimate the loading vector lT

Step 2

Compute the score vector: s = X l;

Compare the new and the old score vector. If the deviations of the elements of the two vectors are within a threshold (e.g., < 10^−5) then go to Step 5, otherwise go to Step 3.

Step 3

Compute new loadings: lT = sT X;

Normalize the loading vector to length one:

$$l^{T} = \frac{l^{T}}{\sqrt{l^{T}\,l}}$$

Step 4

Repeat from Step 2 if the number of iterations does not exceed a predefined threshold (e.g., 100 iterations); otherwise go to Step 5.

Step 5

Determine the matrix of residuals: E = X−s lT.

If the number of principal components is equal to the number of previously fixed or desired components then go to Step 7; otherwise continue at Step 6.

Step 6

Use the matrix of residuals E as the new X-matrix and compute additional principal components s and loadings lT by means of Step 1.

Step 7

As a result, the matrix X is represented by a principal component model according to Eq.10.
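The steps above can be sketched in plain Python as follows. This is a minimal sketch under stated assumptions rather than a reference implementation: Step 0 is read as column-wise mean-centering followed by scaling each column to length one, the initial loading estimate in Step 1 is taken from the longest row of X (the text does not say how to estimate it), and the function names and the toy data are hypothetical.

```python
import math

def autoscale(X):
    """Step 0: centre each column on its mean and scale it to length one."""
    n, m = len(X), len(X[0])
    out = [[0.0] * m for _ in range(n)]
    for j in range(m):
        mean = sum(row[j] for row in X) / n
        centred = [row[j] - mean for row in X]
        length = math.sqrt(sum(v * v for v in centred)) or 1.0
        for i in range(n):
            out[i][j] = centred[i] / length
    return out

def unit(v):
    """Normalize a vector to length one (zero vectors are returned unchanged)."""
    length = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / length for x in v]

def project(X, l):
    """Score vector s = X l."""
    return [sum(x * w for x, w in zip(row, l)) for row in X]

def nipals(X, n_components, tol=1e-5, max_iter=100):
    X = autoscale(X)
    n, m = len(X), len(X[0])
    scores, loadings = [], []
    for _ in range(n_components):
        # Step 1 (assumption): start from the longest row of X.
        l = unit(max(X, key=lambda row: sum(v * v for v in row)))
        s_old = None
        for _ in range(max_iter):                       # Step 4: iteration limit
            s = project(X, l)                           # Step 2
            if s_old is not None and max(abs(a - b) for a, b in zip(s, s_old)) < tol:
                break
            s_old = s
            # Step 3: new loadings l^T = s^T X, normalized to length one.
            l = unit([sum(s[i] * X[i][j] for i in range(n)) for j in range(m)])
        s = project(X, l)                               # scores for the final loadings
        # Steps 5 and 6: the residual matrix E = X - s l^T becomes the new X.
        X = [[X[i][j] - s[i] * l[j] for j in range(m)] for i in range(n)]
        scores.append(s)
        loadings.append(l)
    return scores, loadings

# Hypothetical toy data: five objects, two strongly correlated variables.
DATA = [[1.0, 1.1], [2.0, 1.9], [3.0, 3.2], [4.0, 3.9], [5.0, 5.1]]
pc_scores, pc_loadings = nipals(DATA, 2)
print(pc_loadings[0])
```

For two autoscaled, positively correlated variables the first loading vector points along the diagonal, so both of its elements have nearly the same magnitude; the sign of a loading vector is arbitrary.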

Sammon Mapping

The aim of Sammon's algorithm is to project points from a high-dimensional (m-dimensional) space to a low-dimensional (n-dimensional) space, which usually is two-dimensional. It is conventionally applied to exploratory data analysis. The original algorithm is an iterative method based on a gradient search.52 It finds a data distribution in the n-dimensional target space so that as much as possible of the original distribution in the m-dimensional space is preserved. In this non-linear mapping (NLM), the inter-point distances between vectors in the lower-dimensional space approximate the corresponding distances in the original m-dimensional space. (Note: The idea of Kruskal's mapping procedure is very similar to Sammon's mapping algorithm: again the inter-point distances in the n-dimensional space approximate the corresponding distances in the m-dimensional space, but the original distances are transformed by some monotonic, increasing function.53) The basis for mapping is given by the inter-point distance matrix. Multidimensional scaling (MDS) is a related technique which is also based on a similarity or dissimilarity matrix. A thorough comparison between Sammon's and Kruskal's mapping and MDS can be found elsewhere.54 Sammon's mapping is an optimization procedure starting from an initial configuration of n-dimensional vectors (e.g., randomly chosen or by taking the n columns of the m-dimensional matrix X with maximum variances). A reasonable error function E is called “Sammon's stress” (Eq.12). It measures how well a distribution of k points in the n-space matches the distribution of k points in the m-space, i.e., the difference between the distance matrix of the original vector set, d*, and that of the projected vector set, d:

$$E = \frac{1}{\sum_{i<j} d^{*}_{ij}} \sum_{i<j} \frac{\left(d^{*}_{ij} - d_{ij}\right)^{2}}{d^{*}_{ij}} \qquad (12)$$

An optimization algorithm is applied to decrease the stress, e.g., a steepest descent method to search for the minimum of the error function. Having found the distribution in the n-space after the t-th iteration, the new setting at time t+1 is given by Equation 13.

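$$y_{pq}(t+1) = y_{pq}(t) - \eta\,\Delta_{pq}(t) \qquad (13)$$

where η is the learning rate, y_pq is the q-th co-ordinate of the p-th point in the n-space, and

$$\Delta_{pq}(t) = \frac{\partial E(t) / \partial y_{pq}(t)}{\left| \partial^{2} E(t) / \partial y_{pq}(t)^{2} \right|} \qquad (14)$$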

Of course, several optimization methods can be used to find a minimum of the error function E. After optimization a Sammon map represents the relative distances of vectors in a high-dimensional space and is thus useful in determining the number and the shapes of clusters, as well as their relative distances. Fig. 11b shows the distribution of the UGI reaction products obtained by our implementation of Sammon's mapping. In contrast to the gradient search technique originally proposed by Sammon, we have applied a (1, λ) evolution strategy to this task. Again the active molecules (black spots) are separated from the inactives (white circles). Compared to the PCA result given in Fig. 11a, the map shows a broader distribution of the data and allows for the identification of two clusters among the inactive compounds, which are not visible in the PCA score plot. Clearly, the nonlinear map provides greater individual detail compared to PCA. By preserving the inter-point distances of the original samples, NLM is able to represent the topology and structural relationships of a data set. Despite this appealing feature we must keep in mind that the projection does lead to some loss of information, and the resulting map can be misleading.54
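A minimal sketch of the mapping procedure is given below. It is neither Sammon's original second-order gradient search nor the evolution strategy used for Fig. 11b: Sammon's stress is written in its standard form (assumed to correspond to Eq. 12), it is minimized by plain steepest descent, and a step is accepted only if it lowers the stress (otherwise the learning rate is halved). That acceptance rule, the starting learning rate, and the toy data are assumptions made for illustration. The initial configuration takes the columns of X with maximum variance, as suggested in the text, and all original points must be distinct.

```python
import math

def distance_matrix(points):
    return [[math.dist(p, q) for q in points] for p in points]

def sammon_stress(d_orig, d_proj):
    """Sammon's stress between original and projected inter-point distances."""
    k = len(d_orig)
    pairs = [(i, j) for i in range(k) for j in range(i + 1, k)]
    total = sum(d_orig[i][j] for i, j in pairs)
    return sum((d_orig[i][j] - d_proj[i][j]) ** 2 / d_orig[i][j] for i, j in pairs) / total

def initial_configuration(X, n_dim):
    """Take the n_dim columns of X with maximum variance."""
    k, m = len(X), len(X[0])
    def variance(j):
        mean = sum(row[j] for row in X) / k
        return sum((row[j] - mean) ** 2 for row in X) / k
    columns = sorted(range(m), key=variance, reverse=True)[:n_dim]
    return [[row[j] for j in columns] for row in X]

def stress_gradient(d_orig, Y):
    """First derivatives of the stress with respect to every map co-ordinate."""
    k, n_dim = len(Y), len(Y[0])
    total = sum(d_orig[i][j] for i in range(k) for j in range(i + 1, k))
    d_proj = distance_matrix(Y)
    grad = [[0.0] * n_dim for _ in range(k)]
    for p in range(k):
        for j in range(k):
            if j == p:
                continue
            d = max(d_proj[p][j], 1e-12)   # guard against coincident map points
            factor = (d_orig[p][j] - d) / (d_orig[p][j] * d)
            for q in range(n_dim):
                grad[p][q] -= 2.0 / total * factor * (Y[p][q] - Y[j][q])
    return grad

def sammon_map(X, n_dim=2, eta=0.5, n_iter=300):
    d_orig = distance_matrix(X)
    Y = initial_configuration(X, n_dim)
    stress = sammon_stress(d_orig, distance_matrix(Y))
    for _ in range(n_iter):
        grad = stress_gradient(d_orig, Y)
        trial = [[y - eta * g for y, g in zip(row, g_row)] for row, g_row in zip(Y, grad)]
        trial_stress = sammon_stress(d_orig, distance_matrix(trial))
        if trial_stress < stress:
            Y, stress = trial, trial_stress
        else:
            eta *= 0.5                     # step too long: shorten it and retry
    return Y, stress

# Hypothetical toy data: five points in a three-dimensional descriptor space.
POINTS = [[0.0, 0.0, 0.0], [2.0, 0.0, 0.5], [0.0, 2.0, 1.0], [2.0, 2.0, 1.5], [1.0, 1.0, 3.0]]
start_stress = sammon_stress(distance_matrix(POINTS),
                             distance_matrix(initial_configuration(POINTS, 2)))
mapped, final_stress = sammon_map(POINTS)
print(start_stress, final_stress)
```

Note that the result is only a set of map co-ordinates for the given points; no explicit mapping function is obtained, so new samples cannot be projected without repeating the optimization.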

The Sammon projection map complements those obtained with the SOM algorithm (vide infra), auto-associative neural networks (AANN), the multilayer perceptron (MLP, see Chapter 3), and the principal component (PC) feature extractor. Lerner demonstrated for the example of chromosome classification that classification based on Sammon's (unsupervised) mapping is superior to classification based on the AANN and PC feature extractor, and highly comparable with that based on the (supervised) MLP.55 Further thorough comparison of these and other supervised and unsupervised methods, and application to a wide range of classification and feature extraction tasks, must be performed to substantiate these findings. Additional information and different approaches to the NLM task can be found elsewhere.56–59

Encoder Networks

Figure 12.

Encoder networks for nonlinear mapping of high-dimensional data. Neurons are drawn as circles, weights are represented by lines. Input neurons (white) are fan-out units, hidden-layer units (black) have a sigmoidal or linear activity, and the output neurons (gray) are linear.

a) symmetrical network architecture attempting to reproduce the input patterns by going through a low-dimensional internal representation. Factor 1 and Factor 2 are the score values (co-ordinates) in the low-dimensional (here: two-dimensional) map.

b) conventional feed-forward network with two output neurons. The outputs represent the low-dimensional scores.

A neural network approach to the task of data projection termed “encoder” networks or ReNDeR (Reversible Non-linear Dimension Reduction) networks is of growing interest for data visualization and nonlinear feature extraction.60 The architecture of an encoder network is illustrated in Fig. 12a. The network is symmetrical around a central parameter layer. The number of input and output neurons is defined by the dimension of the data vectors, and the idea of the approach is to reproduce the input patterns at the output layer (auto-association) via an internal representation which is of lower complexity than the original data. In Fig. 12 the parameter layer consists of only two neurons, for a reduction of the data to only two dimensions (“factors”), although this layer can have an arbitrary number of neurons. Once the network weights are optimized by conventional supervised training techniques the input data can be described by the output values of the neurons in the parameter layer. Two or three neurons are especially useful for graphical display. If no intermediate hidden layers are used and the neurons have a linear transfer function, the factors found by the encoder network are equivalent to principal components.61 Presence of hidden layers containing non-linear neuron activities enables the system to perform non-linear mappings of the data, quite similar to Kohonen mapping.62 This “nonlinear PCA” seems to be especially suited for mapping brain activities in cognitive science.63 An attractive feature of encoder networks is the possibility to (re)construct data of the original dimension directly from low-dimensional projections. In a two-dimensional map derived from high-dimensional QSAR data, for example, any position is directly linked to the corresponding original space. This might be useful for compound selection with a desired property (or activity) by inspection of a low-dimensional graphical display. 
Until now, however, this “molecular design” technique is still in its infancy. First applications, advantages and drawbacks of the method have been critically discussed and can be found elsewhere.60,64–66

The use of feed-forward neural networks for data mapping has recently been highlighted by Agrafiotis and Lobanov.54 The idea is very similar to the encoder network approach described above, but in contrast to the network architecture shown in Figure 12a, their network is non-symmetrical (Fig. 12b). The number of output neurons determines the dimension of the projection. The network is trained in a conventional supervised manner aiming at a minimization of the mapping error E (Eq.12) or of similar error functions, e.g., Kruskal's stress. We have implemented such a system, again employing the (1, λ) evolution strategy for network training (F. Hoffmann-La Roche Ltd.; O. Roche, G. Schneider; unpublished). Fig. 11c gives the result for our example, the projection of UGI reaction products from a 150-dimensional space to the plane. Our mapping network contained 150 input neurons, three sigmoidal hidden neurons, and two linear output neurons. The distribution looks very similar to the Sammon map shown in Figure 11b, thereby supporting this NLM. However, in contrast to the Sammon procedure, where the mapping function remains unknown, the trained network represents the nonlinear mapping function explicitly, so a new sample can also be projected onto the low-dimensional display. Compared to the original encoder approach (Fig. 12a) the network is smaller and more suitable for optimization. A drawback of this technique and of the Sammon map is that the meaning of the display axes is unknown, which is not the case for PCA. These approaches therefore nicely complement each other.
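Both the Sammon map and the mapping network of our example were optimized by a (1, λ) evolution strategy. The general scheme of such a strategy can be sketched as follows; this is an illustration under stated assumptions, not the unpublished implementation referred to above. The log-normal step-size rule, the parameter values, and the simple quadratic objective are assumptions; in a mapping application the objective would be the mapping error E evaluated for a set of map co-ordinates or network weights.

```python
import math
import random

def one_comma_lambda_es(objective, start, sigma=1.0, lam=10, generations=100, seed=0):
    """Minimize `objective` with one parent and `lam` offspring per generation."""
    rng = random.Random(seed)
    tau = 1.0 / math.sqrt(len(start))   # assumed learning parameter for the step size
    parent, parent_sigma = list(start), sigma
    best, best_value = list(start), objective(start)
    for _ in range(generations):
        offspring = []
        for _ in range(lam):
            # Each offspring first mutates its step size, then its variables.
            child_sigma = parent_sigma * math.exp(tau * rng.gauss(0.0, 1.0))
            child = [x + child_sigma * rng.gauss(0.0, 1.0) for x in parent]
            offspring.append((objective(child), child, child_sigma))
        # Comma selection: the parent is discarded, the best offspring replaces it.
        value, parent, parent_sigma = min(offspring, key=lambda item: item[0])
        if value < best_value:          # remember the best solution seen so far
            best, best_value = parent, value
    return best, best_value

def sphere(v):
    """Hypothetical test objective with its minimum at the origin."""
    return sum(x * x for x in v)

solution, value = one_comma_lambda_es(sphere, [3.0, -3.0])
print(solution, value)
```

Since the parent never survives, the current solution may temporarily get worse, which is why the sketch keeps a separate record of the best solution found.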

Self-Organizing Networks

Figure 13A.

Architecture of a self-organizing map (SOM). Network containing (6 × 5) = 30 neurons. Each neuron is a four-dimensional vector represented by a stack of four cubes. An input signal (pattern vector x) leads to a response of a single neuron (“winner-takes-all”, gray-colored). Usually the top-down view of an SOM is shown.

Figure 13B.

Architecture of a self-organizing map (SOM). A toroidal SOM (top-down view). The neurons in the first and the second neighborhood to the gray-shaded neuron are indicated by black lines. The star symbol is in the second neighborhood of the neuron.

Figure 14.

Principle of vector quantization. In this example, two-dimensional data vectors (pattern vectors; open arrowheads) form two distinct clusters. During the vector quantization process neuron vectors (filled arrowheads) move toward the centers of the clusters, thereby forming the cluster centroids.

Complementing the visualization techniques described in this Chapter, the self-organizing map (SOM) has proven its usefulness for drug discovery, in particular for the tasks of data classification, feature extraction, and visualization. Therefore this method will be described in some more detail. The SOM belongs to the class of unsupervised neural networks and was pioneered by Kohonen in the early 1980s.67 Among other applications, e.g., in robotics,68 it can be used to generate low-dimensional, topology-preserving projections of high-dimensional data. SOMs contain only a single layer of neurons. In contrast to the supervised, multi-layered ANN discussed in other Chapters, the neurons of an SOM do not compute an output value from incoming signals. Rather they represent vectors of the same dimension as the input patterns (Fig. 13A, Fig. 13B) and adopt either an “active” or an “inactive” state. For data processing the input pattern (a molecular descriptor vector) is compared to all neurons in the output layer, and the one neuron vector that is most similar to the input pattern, the so-called “winner neuron”, fires a signal, i.e., it is active. All other neurons are inactive. In this way, each pattern is assigned to exactly one neuron. The data patterns belonging to a neuron form a cluster, as they are more similar to their neuron than to any other neuron of the SOM. During the SOM training process, an optimization procedure following the principles of unsupervised Hebbian learning, the original high-dimensional space is tessellated, resulting in a certain number of data clusters. As many clusters are formed as there are neurons in the SOM. The neurons represent prototype vectors of each cluster. This process is similar to vector quantization (Fig. 14), and the resulting prototype vectors capture features in the input space that are unique for each data cluster. Feature analysis can be done, e.g., by comparing adjacent neurons.

Kohonen's algorithm represents a strikingly efficient way for mapping similar patterns, given as vectors close to each other in input space, onto contiguous locations in the output space.67 This is achieved by introducing a topology to the SOM neuron layer. The simplest topology is a chain of neurons, followed by a two-dimensional grid. Topological mapping can be achieved by two simple rules:

  1. Locate the best-matching neuron (winner neuron).

  2. Increase matching at this unit and its topological neighbors.

For the first rule only vector distances between the input pattern x and the neuron vectors w must be calculated (Eq.15). The number of comparisons needed depends linearly on the size of the self-organizing system C, i.e., on its number of neurons.

$$s = \arg\min_{c \in C} \lVert x - w_{c} \rVert \qquad (15)$$

The second rule requires an updating procedure to adapt the vector elements of the winner neuron s and its topological neighbors (Eq.16), where Ns is the topological neighborhood around the neuron s and ϵ = ϵ(d(c, s), t) is a learning rate depending on both the topological distance d(c, s) between s and the neuron c, and on the time t. In this context, time is usually measured in number of input patterns presented to the network. In many applications a Gaussian neighborhood function is used. A toroidal neuron topology can be used to avoid some boundary problems inherent to a planar topology (Fig. 13A, Fig. 13B).67,69,70

$$\Delta w_{c} = \epsilon(d(c,s),t)\,(x - w_{c}) \quad \text{for all } c \in N_{s} \qquad (16)$$

The complete SOM algorithm can be formulated as follows:

The Self-Organizing Map Algorithm

Step 1

  • Initialize the self-organizing map A to contain N = N1 × N2 neurons ci:

$$A = \{c_{1}, c_{2}, \ldots, c_{N}\}$$

with reference vectors $w_{c_i} \in \mathbb{R}^{n}$ chosen randomly according to p(ξ) from the set of training patterns.

  • Initialize the connection set C to form a rectangular N1 × N2 grid.

  • Initialize the time parameter t = 0.

Step 2

  • Generate at random an input signal ξ according to p(ξ).

Step 3

  • Determine the winner neuron s according to Equation 15:

$$s(\xi) = \arg\min_{c \in A} \lVert \xi - w_{c} \rVert$$

Step 4

  • Adapt each neuron r according to

$$\Delta w_{r} = \epsilon(t)\, h_{rs}\, (\xi - w_{r})$$

where the Hamming distance d1 is used to measure the neuron-to-neuron distance on the SOM grid, and a Gaussian neighborhood around the winner neuron s is used

$$h_{rs} = \exp\left(-\frac{d_{1}(r,s)^{2}}{2\sigma^{2}}\right)$$

with the standard deviation of the Gaussian:

$$\sigma(t) = \sigma_{i} \left(\sigma_{f} / \sigma_{i}\right)^{t/t_{max}}$$

and

$$\epsilon(t) = \epsilon_{i} \left(\epsilon_{f} / \epsilon_{i}\right)^{t/t_{max}}$$

The time-dependent calculation of σ and ϵ requires initial values (subscript i) and final values (subscript f) that must be defined prior to SOM training.

Step 5

  • Increase the time parameter: t = t + 1.

Step 6

  • If t < tmax then continue with Step 2, otherwise terminate.
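The six steps above can be sketched in plain Python as follows. This is a minimal sketch under stated assumptions: a planar rectangular grid is used, the grid distance d1 is computed as the sum of the absolute row and column differences, the Gaussian neighborhood and the exponential decay of σ and ϵ are written in a commonly used form, and all parameter values, function names, and the toy patterns are hypothetical.

```python
import math
import random

def winner(weights, x):
    """Step 3: the neuron whose reference vector is closest to the input signal."""
    return min(weights, key=lambda c: math.dist(weights[c], x))

def train_som(patterns, n1, n2, t_max, sigma_i=2.0, sigma_f=0.1,
              eps_i=0.5, eps_f=0.005, seed=0):
    rng = random.Random(seed)
    # Step 1: N1 x N2 grid; reference vectors drawn from the training patterns.
    grid = [(i, j) for i in range(n1) for j in range(n2)]
    weights = {c: list(rng.choice(patterns)) for c in grid}
    for t in range(t_max):                      # Steps 5 and 6: time loop
        xi = rng.choice(patterns)               # Step 2: random input signal
        s = winner(weights, xi)                 # Step 3: winner neuron
        # Time-dependent neighborhood width and learning rate (assumed schedules).
        sigma = sigma_i * (sigma_f / sigma_i) ** (t / t_max)
        eps = eps_i * (eps_f / eps_i) ** (t / t_max)
        for r in grid:                          # Step 4: adapt every neuron
            d1 = abs(r[0] - s[0]) + abs(r[1] - s[1])
            h = math.exp(-d1 * d1 / (2.0 * sigma * sigma))
            w = weights[r]
            for k in range(len(w)):
                w[k] += eps * h * (xi[k] - w[k])
    return weights

# Hypothetical toy data: two small clusters of two-dimensional patterns.
PATTERNS = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0), (9.0, 10.0), (10.0, 9.0)]
som = train_som(PATTERNS, 3, 3, 500)
print(winner(som, PATTERNS[0]), winner(som, PATTERNS[3]))
```

After training, calling `winner` for a new descriptor vector assigns it to a neuron, i.e., to a cluster of the map. A toroidal topology would only change the way the grid distance d1 is computed.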

Figure 15.

Stages of SOM adaptation. A planar (10 × 10) SOM was trained to map a two-dimensional data distribution (small black spots). The receptive fields of the final map are indicated by Voronoi tessellation in the lower left projection. A and B denote two “empty” neurons, i.e., there are no data points captured by these neurons. The simulation was performed using the SOM tutorial software written by H.S. Loos and B. Fritzke;113 Figures were adapted from the graphical www output.

In Fig. 15 some snapshots of an SOM training process are shown. In this example a two-dimensional neuron grid adapts to a two-dimensional data distribution (in this simplified example no dimensionality reduction takes place). As a result, topologically adjacent neurons correspond to adjacent input patterns. The “winner-takes-all” SOM training algorithm forces the weight values of the network to move towards centroids of the data distribution and become a set of prototype vectors. All data points located within the “receptive field” of an output neuron will be assigned to the same cluster. The receptive fields of a fully trained unsupervised neural network (UNN) are comparable to the areas defined by Voronoi tessellation (or Dirichlet tessellation) of the input space.61,71 All data vectors that are closer to the weight vector of one neuron than to any other weight vector belong to its receptive field. The mapping error can be defined by the mean quantization error, mqe (Eq.17),72 where N is the total number of molecules used for SOM training, Rc is the receptive field of a neuron, c; x is the m-dimensional molecular descriptor, and w is the m-dimensional cluster centroid (neuron vector):

$$mqe = \frac{1}{N} \sum_{c} \sum_{x \in R_{c}} \lVert x - w_{c} \rVert^{2} \qquad (17)$$
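Receptive fields and the mapping error can be computed directly from the neuron vectors. The following sketch assumes that the mean quantization error is the average squared Euclidean distance between each descriptor vector and the vector of its winner neuron; whether the distance is squared is an assumption here, and the neuron labels and toy data are hypothetical.

```python
import math

def receptive_fields(neurons, data):
    """Assign every data vector to its closest neuron (Voronoi-like regions)."""
    fields = {c: [] for c in neurons}
    for x in data:
        closest = min(neurons, key=lambda c: math.dist(neurons[c], x))
        fields[closest].append(x)
    return fields

def mean_quantization_error(neurons, data):
    """Average squared distance between each vector and its neuron vector."""
    fields = receptive_fields(neurons, data)
    total = sum(math.dist(x, neurons[c]) ** 2
                for c, members in fields.items() for x in members)
    return total / len(data)

# Hypothetical toy example: two neurons and four two-dimensional descriptors.
NEURONS = {"a": [0.0, 0.0], "b": [10.0, 10.0]}
DATA = [[1.0, 0.0], [0.0, 1.0], [9.0, 10.0], [10.0, 12.0]]
fields = receptive_fields(NEURONS, DATA)
mqe = mean_quantization_error(NEURONS, DATA)
print({c: len(members) for c, members in fields.items()}, mqe)
```

A neuron whose receptive field stays empty, like the neurons A and B in Fig. 15, simply contributes nothing to the sum.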

Many variations and extensions of Kohonen's algorithm have been published ever since his original paper appeared. For a recent overview, see for example a volume by Oja and Kaski.73 One major limitation of the original SOM algorithm is that the dimension of the output space and the number of neurons must be predefined prior to SOM training. Self-organizing networks with adapting network size and dimension provide more advanced and sometimes more adequate solutions to data mining and feature extraction.72 A disadvantage of Kohonen-networks can be the comparatively long training time needed, especially if large data sets are used since every data vector must be presented several times to the network for weight adaptation. Hybrid multi-layered UNN which can be trained extremely fast employing very large data sets have already been developed.71,74,75 These systems can provide an alternative classification tool to Kohonen-networks if real-time or on-line computation is required, e.g., for control and analysis of HTS results. Usually they contain more than two layers of neurons where from layer to layer a more subtle data classification is performed, and only parts of the network are adapted during one training cycle. This reduces the training time needed since a data vector is not compared to every weight vector as in classical Kohonen-networks. Especially combinations of supervised and unsupervised learning techniques are under steady development and represent a very active area of current research. The authors of this book are convinced that such systems will become an indispensable part of bio- and chemoinformatics in the field of drug discovery. Several practical applications of the SOM to compound classification, drug design, and chemical similarity searching are described in the following part of this Chapter.

Figure 16.

Structures of two potent thrombin inhibitors: PPACK (left) and Argatroban (right).

Figure 11d gives a (10 × 10) SOM obtained for the data set of UGI reaction products. A striking advantage of the SOM over the other three projections shown in Figure 11 is the automatic classification of data, i.e., cluster detection and definition of cluster boundaries. Black neurons contain inhibitors, white neurons contain inactive compounds. The gray-shaded square (1/3) represents a mixed cluster. For subsequent similarity searching, we can use the neuron vectors representing the characteristic features of strong thrombin binders: taking the vector of neuron (2/4) as the query for searching the 33k entries of the MedChem Database (version 1997, distributed by Daylight Chemical Information Systems Inc., Irvine, CA, USA), the well known nanomolar thrombin inhibitors PPACK and Argatroban were retrieved (Fig. 16). Both molecules, PPACK and Argatroban differ in their scaffold architecture from the UGI products. This example demonstrates that the CATS topological atom type descriptor can be useful for “backbone hopping”, and that the SOM technique is a useful means to extract function-determining molecular features. Several variations of this principle have been developed and successfully applied to retrieving novel active structures from databases.76–78 Related descriptors of molecular topology have been developed and productively used in virtual screening experiments by several research groups.79–85 A fast and straightforward multiple pharmacophore approach, which builds on an ensemble of 3D hypotheses, has been published recently by Bradley and coworkers addressing some of the limitations inherent to 2D techniques.86 Further information about pharmacophore extraction and related techniques can be found elsewhere.87–90

Figure 17.

Virtual screening for potential antidepressants by a self-organizing map. The SOM represents a topology-preserving visualization of a high-dimensional chemical space spanned by 150 descriptors (CATS). The distribution of a set of known 597 antidepressants is indicated by gray-shading (white: only antidepressants; black: only other drugs; gray: mixed cluster). A separation of antidepressant agents and “other” drugs can be observed. The two known antidepressants imipramine and fluoxetine were predicted to fall in the “antidepressants area” [neuron (4/7) and neuron (4/9)], and the NK1 inhibitor NKP-608 is located on an “activity island” on the map [neuron (7/4)].

The plot shown in Fig. 17 was obtained by training a self-organizing map using the software tool NEUROMAP.91 It demonstrates the ability of the CATS descriptor to separate antidepressants from other drugs. All molecules were compiled from the Derwent World Drug Index (Derwent Information, London). Areas in the upper left corner of the SOM are dominated by antidepressant agents, whereas the remaining parts of the map are populated by “other” drug molecules. Again, the SOM can now be used as a coarse-grained virtual screening tool by projecting (“dropping”) new molecules onto the map, e.g., virtual combinatorial libraries or corporate database entries. To demonstrate this idea, three known antidepressants were omitted during map training, namely imipramine, fluoxetine, and NKP-608, the last of which was developed by a research group at Novartis.92,93 Imipramine and fluoxetine were predicted to fall into the “antidepressants area”, and the recently identified NK1 (substance P-preferring tachykinin) receptor antagonist NKP-608 is located on an adjacent “activity island” on the map. Interestingly, NKP-608 is assigned to a cluster containing several classical antidepressants such as tianeptine, a known serotoninergic agent. It would now be an interesting challenge to test tianeptine derivatives for NK1 binding capability. Irrespective of the outcome of such assays, these virtual screening results clearly support the applicability of the topological pharmacophore descriptor that was chosen for molecule representation in conjunction with the SOM: the three compounds given as examples in Figure 17 would have been identified as potential novel antidepressant agents.
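
Projecting (“dropping”) a new molecule onto a trained map amounts to finding the neuron whose weight vector lies closest to the molecule's descriptor vector. The following sketch shows this step under simplifying assumptions: squared Euclidean distance, a hypothetical 2 × 2 map with two-dimensional weights, and invented cluster labels. It is not the NEUROMAP implementation.

```python
def best_matching_unit(vector, som):
    """Return the (row, col) key of the neuron whose weights are closest."""
    best, best_d = None, float("inf")
    for position, weights in som.items():
        # Squared Euclidean distance is sufficient for finding the minimum.
        d = sum((v - w) ** 2 for v, w in zip(vector, weights))
        if d < best_d:
            best, best_d = position, d
    return best

# Hypothetical trained map: neuron position -> weight vector.
som = {
    (1, 1): [0.9, 0.1],
    (1, 2): [0.7, 0.3],
    (2, 1): [0.2, 0.8],
    (2, 2): [0.1, 0.9],
}
# Hypothetical labels assigned to each neuron after training.
labels = {
    (1, 1): "antidepressant",
    (1, 2): "mixed",
    (2, 1): "other",
    (2, 2): "other",
}

new_molecule = [0.85, 0.2]  # descriptor of a molecule not used in training
unit = best_matching_unit(new_molecule, som)
predicted = labels[unit]
print(unit, predicted)
```

The map itself is not altered by this step; the weights stay fixed, and only the assignment of the new molecule to an existing cluster is computed.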

Comparison of substance classes and compound libraries is a further application area of the SOM. In the following example, this method was used to compare drugs to a compilation of “nondrugs” in “Ghose & Crippen”-space.94–96 Each molecule was coded by a 120-dimensional vector giving the counts of the 120 molecular fragments defined by Ghose and coworkers.97,98 For graphical display, the molecule distributions in this 120-dimensional two-point pharmacophore space were projected onto a toroidal map consisting of (15 × 15) = 225 neurons (clusters). To determine the raw classification accuracy of an SOM, the correlation coefficient, cc, according to Matthews was calculated (Eq. 18).99 In Equation 18, P is the number of positive correct predictions, N is the number of negative correct predictions, O is the number of false-positive predictions (overprediction), and U is the number of false-negative predictions (underprediction).

$$
cc = \frac{P N - O U}{\sqrt{(N + U)(N + O)(P + U)(P + O)}} \qquad (18)
$$
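
The Matthews coefficient referred to as Eq. 18 can be evaluated with a few lines of code. The sketch below uses the standard form of the coefficient with the symbols P, N, O, and U as defined in the text; the function name `matthews_cc` and the convention of returning 0.0 for an empty denominator are assumptions made for illustration.

```python
import math

def matthews_cc(P, N, O, U):
    """Matthews correlation coefficient.

    P: correct positive predictions, N: correct negative predictions,
    O: false positives (overprediction), U: false negatives (underprediction).
    """
    denominator = math.sqrt((N + U) * (N + O) * (P + U) * (P + O))
    if denominator == 0:
        return 0.0  # assumed convention when a row or column sum is zero
    return (P * N - O * U) / denominator

# Limiting cases with hypothetical counts:
perfect = matthews_cc(50, 50, 0, 0)      # all predictions correct
inverted = matthews_cc(0, 0, 50, 50)     # all predictions wrong
random_like = matthews_cc(25, 25, 25, 25)  # no better than chance
print(perfect, inverted, random_like)
```

The coefficient ranges from −1 (complete disagreement) through 0 (chance level) to +1 (perfect prediction), which makes it more informative than the raw percentage of correct predictions when the two classes differ in size.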

Figure 18. SOM projection of a chemical space filled with 4,998 drugs and 4,282 nondrugs. The frequencies of 120 Ghose & Crippen fragments were used to encode each molecule. Each square represents a cluster of molecules (Voronoi region). Note that the (10 × 10) map forms a torus. Data sets courtesy of J. Sadowski.

a) the ratio of drugs and nondrugs clustered is shown by grey scale shading (white: pure nondrug cluster, black: pure drug cluster).

b) binary classification of the distribution shown in (a). The Matthews correlation coefficient for this classification is cc = 0.48.

To see whether the descriptor is able to separate “drugs” from “nondrugs”, and thus may be used to analyze “drug-relevant” chemical space, an SOM was developed using Sadowski's collection of drugs and nondrugs (Fig. 18).94 This data set was compiled from the WDI (4,998 drugs) and the Available Chemicals Directory (ACD; 4,282 nondrugs). For details about this data set and its limitations, see Chapter 4 and the original publication by Sadowski and Kubinyi.94 The SOM projection reveals a pronounced separation between drugs and nondrugs in Ghose & Crippen-space, as indicated by the light and dark areas in Fig. 18a. To estimate the classification ability of this SOM, a binary pattern class assignment was introduced (Fig. 18b): a cluster was regarded as “drug-like” (white color) if it contained more than 50% drug molecules; otherwise it was regarded as belonging to the “nondrug-like” class (black color). This straightforward binary class assignment clearly shows a drug and a nondrug region, with 3,637 drugs (73%) and 3,155 nondrugs (75%) correctly classified. This corresponds to a Matthews correlation coefficient of cc = 0.48. From the observation of distinct preferred drug and nondrug regions we concluded that the molecular descriptor might be suited for the analysis of “drug-relevant” space. The binary prediction accuracy obtained is low compared with the accuracy that can be achieved by supervised feature extraction techniques and more problem-specific molecular descriptors (see Chapters 3 and 4). It must be stressed that in this study the SOM was not intended to form a prediction system; the aim was to visualize the distributions of compound libraries in a high-dimensional space.
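
The binary class assignment described above can be sketched as follows. Each neuron is labeled by majority vote using the "more than 50% drugs" rule, and the four prediction counts are then accumulated over all neurons. The per-neuron counts are invented toy values, and the function names are hypothetical.

```python
def classify_clusters(counts):
    """Label each neuron from its (n_drugs, n_nondrugs) occupancy."""
    labels = {}
    for neuron, (n_drugs, n_nondrugs) in counts.items():
        total = n_drugs + n_nondrugs
        # "drug" only if strictly more than 50% of the members are drugs.
        is_drug = total > 0 and n_drugs / total > 0.5
        labels[neuron] = "drug" if is_drug else "nondrug"
    return labels

def prediction_counts(counts, labels):
    """Return (P, N, O, U): correct drugs, correct nondrugs,
    nondrugs in drug clusters, drugs in nondrug clusters."""
    P = N = O = U = 0
    for neuron, (n_drugs, n_nondrugs) in counts.items():
        if labels[neuron] == "drug":
            P += n_drugs
            O += n_nondrugs
        else:
            N += n_nondrugs
            U += n_drugs
    return P, N, O, U

# Hypothetical occupancy of a 2 x 2 map: neuron -> (drugs, nondrugs).
counts = {(0, 0): (8, 2), (0, 1): (3, 7), (1, 0): (5, 5), (1, 1): (1, 9)}
labels = classify_clusters(counts)
P, N, O, U = prediction_counts(counts, labels)
print(labels, P, N, O, U)
```

Note that a neuron with exactly equal numbers of drugs and nondrugs falls into the "nondrug" class under this rule, as does an empty neuron.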

The extension of this approach to a comparison of natural products and trade drugs, and to the consequent virtual library design, was carried out by Lee and Schneider.96 It was demonstrated that natural compounds provide interesting novel scaffold architectures, which can be used in combinatorial drug design approaches. However, in most cases the scaffolds will have to be modified to ensure synthetic feasibility and stability and to prevent adverse pharmacokinetic effects. Taking such a natural scaffold in combination with synthetic side-chains might become a typical strategy in future drug design.100

Figure 19. SOM showing the distribution of 5,726 trade drugs (left) and a 40 × 40 = 1,600-member combinatorial library (right) in CATS topological pharmacophore space. Apparently these two compound collections do not overlap significantly.

A straightforward method for combinatorial library design using the SOM technique is illustrated in Figure 19. To determine the usefulness of a combinatorial library, the members of the corresponding virtual library can be projected onto a map displaying the distribution of a set of reference compounds. In our example, 5,726 trade drugs (from WDI) served as this reference set. The question was whether the scaffold structure shown in Figure 19 might provide a good starting point for the combinatorial design of drug-like structures. A virtual 40 × 40 = 1,600-member combinatorial library was built using 40 generic molecular building blocks. The CATS topological pharmacophore descriptor was used to encode these molecules. The map showing the distribution of the 5,726 trade drugs (Fig. 19 left) and the virtual library (Fig. 19 right) in CATS topological pharmacophore space reveals that these two compound collections apparently do not overlap significantly. From this observation one may conclude that: i) either the scaffold structure does not represent a drug-like substrate, and therefore the two libraries do not significantly overlap; or ii) the virtual library complements the collection of trade drugs in such a way that a larger proportion of pharmacophore space could be accessed. Of course, these considerations are of a purely theoretical nature, and only a series of practical experiments will teach us whether this particular combinatorial library has drug-like or nondrug-like characteristics. This general approach has proven very useful for prioritizing combinatorial libraries and scaffolds, and for assessing external vendor libraries intended to extend the corporate compound database.
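
One simple way to quantify how strongly two compound collections overlap on a map is to compare the sets of neurons they occupy. The sketch below does this with the Jaccard index of the occupied neurons; this particular overlap measure, the toy 2 × 2 map, and all vectors are assumptions chosen for illustration and are not taken from the study shown in Figure 19.

```python
def best_matching_unit(vector, som):
    """Return the key of the neuron whose weight vector is closest."""
    return min(
        som,
        key=lambda pos: sum((v - w) ** 2 for v, w in zip(vector, som[pos])),
    )

def occupied_neurons(vectors, som):
    """Set of neurons that receive at least one molecule."""
    return {best_matching_unit(vec, som) for vec in vectors}

def jaccard_overlap(a, b):
    """Shared occupied neurons divided by all occupied neurons."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Hypothetical trained map and two small compound collections.
som = {(0, 0): [0.0, 0.0], (0, 1): [0.0, 1.0],
       (1, 0): [1.0, 0.0], (1, 1): [1.0, 1.0]}
reference_set = [[0.1, 0.1], [0.2, 0.9]]    # stands in for the trade drugs
virtual_library = [[0.9, 0.8], [0.1, 0.8]]  # stands in for the library

ref_neurons = occupied_neurons(reference_set, som)
lib_neurons = occupied_neurons(virtual_library, som)
overlap = jaccard_overlap(ref_neurons, lib_neurons)
print(sorted(ref_neurons), sorted(lib_neurons), round(overlap, 3))
```

A value near 0 indicates that the library occupies regions of the map not covered by the reference set, which is compatible with either of the two interpretations given in the text; the number alone cannot distinguish between them.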

The applicability of the SOM for mapping elements of protein structure, such as secondary structure elements or surface pockets, was demonstrated recently.101–103 Knowledge of the 3D structure of a target protein undoubtedly is a rich source of information for computer-aided drug design. Of special interest are the size and form of the active site, and the distribution of functional groups and lipophilic areas. Because the number of solved X-ray structures of proteins, and thus the amount of available information, is rapidly increasing, it is desirable to address questions related to coverage of the protein structure universe, conserved patterns of functional groups, or common ligand binding motifs.104,105 It is evident that such an analysis cannot be performed by visual inspection of structural models only. Automatic procedures for the analysis, prediction, and comparison of macromolecular structures, in particular of potential binding sites in proteins, will be very helpful tools.106

One such implementation of a computational method developed at Roche includes four steps:103 i) automated detection of protein surface pockets; ii) generation of a property-encoded solvent accessible surface (SAS) for each pocket; iii) generation of correlation vectors of the SAS to obtain rotation- and translation-invariant descriptors; and iv) SOM projection of these vectors onto a low-dimensional display. As a result, a two-dimensional map is obtained showing the distribution of surface cavities in a chemical property space. This method was originally applied to a set of 176 proteins from the Protein Data Bank (PDB)107 containing a catalytically active zinc ion in the active site. On the resulting SOM, the active site pockets were clearly separated from other surface cavities, with only a few misclassifications. A more detailed analysis revealed that the automated mapping of the active sites accurately reflects established enzyme classification. Such a projection and analysis technique can give new insight into local structural similarities between enzymes exhibiting completely different folds and functions. Furthermore, the SOM mapping technique allowed for the correct classification of surface pockets derived from proteins that were not contained in the training set. We are convinced that this and other similar techniques bear a significant potential for automated protein structure analysis and drug design.108 If possible, the analysis of macromolecular (target) features should parallel feature extraction from sets of known ligands to obtain desired novel designs.
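
The idea behind step iii), a descriptor that does not change when the pocket is rotated or translated, can be illustrated with a generic autocorrelation sketch. Property products of all point pairs are summed into bins according to the distance between the points; since only inter-point distances enter, the result is independent of the coordinate frame. The bin width, the number of bins, the toy surface points, and the property values are all assumptions, and this is not the published Roche implementation.

```python
import math

def correlation_vector(points, props, bin_width=1.0, n_bins=4):
    """Sum property products p_i * p_j over point pairs, binned by distance."""
    vector = [0.0] * n_bins
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            distance = math.dist(points[i], points[j])
            bin_index = int(distance // bin_width)
            if bin_index < n_bins:  # pairs beyond the last bin are ignored
                vector[bin_index] += props[i] * props[j]
    return vector

# Hypothetical surface points (x, y, z) with one property value each.
points = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (0.0, 2.5, 0.0)]
props = [1.0, 2.0, 3.0]

original = correlation_vector(points, props)

# The same points after a translation and after a cyclic axis permutation
# (a proper rotation) yield the same descriptor.
translated = [(x + 10.0, y - 3.0, z + 7.0) for x, y, z in points]
rotated = [(y, z, x) for x, y, z in points]
after_translation = correlation_vector(translated, props)
after_rotation = correlation_vector(rotated, props)
print(original, after_translation, after_rotation)
```

Vectors of this kind have a fixed length regardless of pocket size, which is what makes them suitable as input for the SOM projection in step iv).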

References
1. Walters WP, Stahl MT, Murcko MA. Virtual screening: An overview. Drug Discovery Today. 1998; 3: 160–178.
2. Todeschini R, Consonni V. Handbook of Molecular Descriptors. Weinheim, New York: Wiley-VCH, 2000.
3. Lipinski CA, Lombardo F, Dominy BW, Feeney PJ. Experimental and computational approaches to estimate solubility and permeability in drug discovery and development settings. Adv Drug Deliv Rev. 1997; 23: 3–25.
4. Ghose AK, Viswanadhan VN, Wendoloski JJ. A knowledge-based approach in designing combinatorial or medicinal chemistry libraries for drug discovery. 1. A qualitative and quantitative characterization of known drug databases. J Comb Chem. 1999; 1: 55–68.
5. Teague SJ, Davis AM, Leeson PD, Oprea T. The design of leadlike combinatorial libraries. Angew Chem Int Ed Engl. 1999; 38: 3743–3748.
6. Wang J, Ramnarayan K. Toward designing drug-like libraries: A novel computational approach for prediction of drug feasibility of compounds. J Comb Chem. 1999; 1: 524–533.
7. Xu J, Stevenson J. Drug-like index: A new approach to measure drug-like compounds and their diversity. J Chem Inf Comput Sci. 2000; 40: 1177–1187.
8. Oprea TI. Property distribution of drug-related chemical databases. J Comput Aided Mol Des. 2000; 14: 251–264.
9. Bremermann HJ. What mathematics can and cannot do for pattern recognition. In: Grüsser OJ, Klinke R, eds. Zeichenerkennung durch biologische und technische Systeme. Berlin: Springer Verlag, 1973. 31–45.
10. Lohmann R. Structure evolution in neural systems. In: Soucek B and the IRIS Group, eds. Dynamic, Genetic, and Chaotic Programming. New York: John Wiley & Sons Inc, 1992. 395–411.
11. Haugeland J. Artificial Intelligence: The Very Idea. Cambridge: The MIT Press, 1989.
12. Otto M. Chemometrics, Statistics and Computer Application in Analytical Chemistry. Weinheim, New York: Wiley-VCH, 1999.
13. Clocksin WF, Mellish CS. Programming in PROLOG. 2nd ed. Berlin, Heidelberg: Springer, 1984.
14. Li D, Liu D. A Fuzzy PROLOG Database System. New York: John Wiley & Sons Inc, 1990.
15. King RD, Muggleton S, Lewis RA, Sternberg M J E. Drug design by machine learning. Proc Natl Acad Sci USA. 1992; 89: 11322–11326.
16. King RD, Muggleton SH, Srinivasan A, Sternberg MJ. Structure-activity relationships derived by machine learning: the use of atoms and their bond connectivities to predict mutagenicity by inductive logic programming. Proc Natl Acad Sci USA. 1996; 93: 438–442.
17. King RD, Srinivasan A. The discovery of indicator variables for QSAR using inductive logic programming. J Comput Aided Mol Des. 1997; 11: 571–580.
18. Muggleton S. Inductive Logic Programming. London: Academic Press, 1992.
19. Morik K, Wrobel S, Kietz JU, Emde W. Knowledge Acquisition and Machine Learning: Theory, Models, and Applications. London: Academic Press, 1993.
20. Schulze-Kremer S. Molecular Bioinformatics: Algorithms and Applications. Berlin, New York: Walter de Gruyter, 1996.
21. Plotkin GD. A note on inductive generalization. In: Meltzer B, Michie D, eds. Machine Intelligence. New York: American Elsevier, 1970. 153–163.
22. King RD, Karwath A, Clare A, Dehaspe L. Accurate prediction of protein functional class from sequence in the Mycobacterium tuberculosis and Escherichia coli genomes using data mining. Yeast. 2000; 17: 283–293.
23. Turcotte M, Muggleton SH, Sternberg MJ. Automated discovery of structural signatures of protein fold and function. J Mol Biol. 2001; 306: 591–605.
24. King RG. A machine learning approach to the problem of predicting a protein's secondary structure from its primary structure (PROMIS). Ph.D. Thesis, University of Strathclyde, Strathclyde, UK, 1988.
25. Taylor WR. The classification of amino acid conservation. J Theor Biol. 1986; 119: 205–221.
26. Schneider G, Wrede P. Signal analysis in protein targeting sequences. Protein Seq Data Anal. 1993; 5: 227–236.
27. Brunner M, Klaus C, Neupert W. The mitochondrial processing peptidase. In: von Heijne G, ed. Signal Peptides. Austin: RG Landes Company, 1994. 73–86.
28. Schneider G, Sjöling S, Wallin E, Wrede P, Glaser E, von Heijne G. Feature-extraction from endopeptidase cleavage sites in mitochondrial targeting sequences. Proteins. 1998; 30: 49–60.
29. Attwood T, Parry-Smith DJ. Introduction to Bioinformatics. Essex: Addison Wesley Longman Limited, 1999.
30. Durbin R, Eddy S, Krogh A, Mitchison G. Biological Sequence Analysis. Cambridge: Cambridge University Press, 1998.
31. Spencer RW. High-throughput screening of historic collections: Observations on file size, biological targets, and file diversity. Biotechnol Bioeng. 1998; 61: 61–67.
32. Bayada DM, Hamersma H, van Geerestein VJ. Molecular diversity and representativity in chemical databases. J Chem Inf Comput Sci. 1999; 39: 1–10.
33. Eglen RM, Schneider G, Böhm HJ. High-throughput screening and virtual screening: entry points to drug discovery. In: Schneider G, Böhm HJ, eds. Virtual Screening for Bioactive Molecules. Weinheim, New York: Wiley-VCH: 1–14.
34. Lewell XQ, Judd DB, Watson SP, Hann MM. RECAP - Retrosynthetic combinatorial analysis procedure: A powerful new technique for identifying privileged molecular fragments with useful applications in combinatorial chemistry. J Chem Inf Comput Sci. 1998; 38: 511–522.
35. Rechenberg I. Optimierung technischer Systeme nach Prinzipien der biologischen Evolution. Stuttgart: Frommann-Holzboog, 1973.
36. Barnard JM, Downs GM, Willett P. Descriptor-based similarity measures for screening chemical databases. In: Schneider G, Böhm HJ, eds. Virtual Screening for Bioactive Molecules. Weinheim, New York: Wiley-VCH: 59–80.
37. Willett P, Barnard JM, Downs GM. Chemical similarity searching. J Chem Inf Comput Sci. 1998; 38: 983–996.
38. Daylight Chemical Information Systems, Inc., 27401 Los Altos, 360 Mission Viejo, CA 92691, USA. http://www.daylight.com .
39. Schneider G, Neidhart W, Giller T, Schmid G. “Scaffold-Hopping” by topological pharmacophore search: a contribution to virtual screening. Angew Chem Int Ed Engl. 1999; 38: 2894–2896.
40. Güner OF. Pharmacophore Perception, Development and Use in Drug Design. La Jolla: International University Line, Biotechnology Series, 2000.
41. Good AC, Mason JS, Pickett SD. Pharmacophore pattern application in virtual screening, library design and QSAR. In: Böhm HJ, Schneider G, eds. Virtual Screening for Bioactive Molecules. New York, Weinheim: Wiley-VCH, 2000. 131–159.
42. Wermuth CG, Ganellin CR, Lindberg P, Mitscher LA. Glossary of terms used in medicinal chemistry (IUPAC Recommendations 1997). Annu Rep Med Chem. 1998; 33: 385–395.
43. Doyle JL, Stubbs L. Ataxia, arrhythmia and ion-channel gene defects. Trends Genet. 1998; 14: 92–98.
44. Perez-Reyes EP. Molecular characterization of a novel family of low voltage-activated, T-type, calcium channels. J Bioenerg Biomembr. 1998; 30: 313–317.
45. Todorovic SM, Prakriya M, Nakashima YM, Nilsson KR, Han M, Zorumski CF, et al. Enantioselective blockade of T-type Ca2+ current in adult rat sensory neurons by a steroid that lacks gamma-aminobutyric acid-modulatory activity. Mol Pharmacol. 1998; 54: 918–927.
46. Mishra SK, Hermsmeyer K. Selective inhibition of T-type Ca2+ channels by Ro 40–5967. Circ Res. 1994; 75: 144–148.
47. Bernardeau A, Ertel EA. Meeting, Montpellier, 1996: Low-voltage-activated T-type Calcium Channels, Adis International, 1998. 386–394.
48. Jackson JE. A User's Guide to Principal Components. New York: John Wiley, 1991.
49. Eriksson L, Johansson E, Kettaneh-Wold N, Wold S. Introduction to Multi- and Megavariate Data Analysis using Projection Methods (PCA & PLS). Umeå: Umetrics, 1999.
50. Devillers J, ed. Neural Networks in QSAR and Drug Design. London: Academic Press, 1996.
51. Backhaus K, Erichson B, Plinke W, Schuchard-Ficher C, Weiber R. Multivariate Analysemethoden. Berlin: Springer Verlag, 1989.
52. Sammon JW. A nonlinear mapping for data structure analysis. IEEE Trans Comput. 1969; C-18: 401–409.
53. Kruskal JB. Multidimensional scaling. Psychometrika. 1964; 29: 115–129.
54. Agrafiotis DK, Lobanov VS. Nonlinear mapping networks. J Chem Inf Comput Sci. 2000; 40: 1356–1362.
55. Lerner B. On pattern classification with Sammon's nonlinear mapping. Pattern Recognition. 1998; 31: 371–381.
56. Chang CL, Lee R C T. A heuristic method for non-linear mapping in cluster analysis. IEEE Trans Syst Man Cybern. 1973; SMC-3: 197–200.
57. Lee R C Y, Slagle JR, Blum H. Triangulation method for the sequential mapping of points from N-space to 2-space. IEEE Trans Comput. 1977; C-27: 288–292.
58. Biswas G, Jain AK, Dubes RC. Evaluation of projection algorithms. IEEE Trans Pattern Anal Machine Intell. 1981; PAMI-3: 701–708.
59. Mao J, Jain AK. A nonlinear projection method based on Kohonen's topology preserving maps. IEEE Trans Neural Networks. 1995; 6: 296–317.
60. Livingstone DJ. Multivariate data display using neural networks. In: Devillers J, ed. Neural Networks in QSAR and Drug Design. London: Academic Press, 1996. 157–176.
61. Hertz J, Krogh A, Palmer RG. Introduction to the Theory of Neural Computation. Redwood City: Addison-Wesley, 1991.
62. Salt DW, Yildiz N, Livingstone DJ, Tinsley CJ. The use of artificial neural networks in QSAR. Pest Sci. 1992; 36: 161–170.
63. Friston K, Phillips J, Chawla D, Buchel C. Revealing interactions among brain systems with nonlinear PCA. Hum Brain Mapp. 1999; 8: 92–97.
64. Good AC, So SS, Richards WG. Structure-activity relationships from molecular similarity matrices. J Med Chem. 1993; 36: 433–438.
65. Good AC, Peterson SJ, Richards WG. QSAR's from similarity matrices. Technique validation and application in the comparison of similarity evaluation methods. J Med Chem. 1993; 36: 2929–2937.
66. Reibnegger G, Werner-Felmayer G, Wachter H. A note on the low-dimensional display of multivariate data using neural networks. J Mol Graph. 1993; 11: 129–133.
67. Kohonen T. Self-Organization and Associative Memory. Heidelberg: Springer-Verlag, 1984.
68. Ritter H, Schulten K, Martinez T. Neuronale Netze - Eine Einführung in die Neuroinformatik selbstorganisierender Netzwerke. Bonn: Addison-Wesley, 1990. English edition: Neural Networks. Reading: Addison-Wesley, 1992.
69. Graepel T, Obermayer K. A stochastic self-organizing map for proximity data. Neural Comput. 1998; 11: 139–155.
70. Bienfait B, Gasteiger J. Checking the projection display of multivariate data with colored graphs. J Mol Graph Model. 1997; 15: 203–215, 254–258.
71. Preparata FP, Shamos MI. Computational Geometry: An Introduction. New York: Springer, 1985.
72. Fritzke B. Growing self-organizing networks - History, status quo, and perspectives. In: Oja E, Kaski S, eds. Kohonen Maps. Amsterdam: Elsevier Science BV, 1999. 131–144.
73. Oja E, Kaski S, eds. Kohonen Maps. Amsterdam: Elsevier Science BV, 1999.
74. Lu T, Lerner J. Spectroscopy and hybrid neural network analysis. Proc IEEE. 1996; 84: 895–905.
75. Melssen WJ, Smits J R M, Buydens L M C, Kateman G. Using artificial neural networks for solving chemical problems. Part II. Kohonen self-organizing feature maps and Hopfield networks. Chemom Intell Lab Syst. 1994; 23: 267–291.
76. Zupan J, Gasteiger J. Neural Networks for Chemists. Heidelberg: Wiley-VCH, 1993.
77. Wagener M, Sadowski J, Gasteiger J. Autocorrelation of molecular surface properties for modeling corticosteroid binding globulin and cytosolic Ah receptor activity by neural networks. J Am Chem Soc. 1995; 117: 7769–7775.
78. Anzali S, Mederski W W R K, Osswald M, Dorsch D. Endothelin antagonists: search for surrogates of methylendioxyphenyl by means of a Kohonen neural network. Bioorg Med Chem Lett. 1997; 8: 11–16.
79. Drie JH, Lajiness MS. Approaches to virtual library design. Drug Discovery Today. 1998; 3: 274–283.
80. Andrews KM, Cramer RD. Toward general methods of targeted library design: topomer shape similarity searching with diverse structures as queries. J Med Chem. 2000; 43: 1723–1740.
81. Matter H. Selecting optimally diverse compounds from structure databases: A validation study of two-dimensional and three-dimensional molecular descriptors. J Med Chem. 1997; 40: 1219–1229.
82. Cho SJ, Zheng W, Tropsha A. Rational combinatorial library design. 2. Rational design of targeted combinatorial peptide libraries using chemical similarity probe and the inverse QSAR approaches. J Chem Inf Comput Sci. 1998; 38: 259–268.
83. Castro EA, Tueros M, Toropov AA. Maximum topological distances based indices as molecular descriptors for QSPR: 2 - Application to aromatic hydrocarbons. Comput Chem. 2000; 24: 571–576.
84. Gupta S, Singh M, Madan AK. Superpendentic index: A novel topological descriptor for predicting biological activity. J Chem Inf Comput Sci. 1999; 39: 272–277.
85. Gupta S, Singh M, Madan AK. Connective eccentricity index: a novel topological descriptor for predicting biological activity. J Mol Graph Model. 2000; 18: 18–25.
86. Bradley EK, Beroza P, Penzotti JE, Grootenhuis P D J, Spellmeyer DC, Miller JL. A rapid computational method for lead evolution: Description and application to alpha(1)-adrenergic antagonists. J Med Chem. 2000; 43: 2770–2774.
87. Milne GW, Nicklaus MC, Wang S. Pharmacophores in drug design and discovery. SAR QSAR Environ Res. 1998; 9: 23–38.
88. Mason JS, Hermsmeier MA. Diversity assessment. Curr Opin Chem Biol. 1999; 3: 342–349.
89. Kirkpatrick DL, Watson S, Ulhaq S. Structure-based drug design: combinatorial chemistry and molecular modeling. Comb Chem High Throughput Screen. 1999; 2: 211–221.
90. Hopfinger AJ, Duca JS. Extraction of pharmacophore information from high-throughput screens. Curr Opin Biotechnol. 2000; 11: 97–103.
91. Schneider G, Wrede P. Artificial neural networks for computer-based molecular design. Prog Biophys Mol Biol. 1998; 70: 175–222.
92. Rupniak NM, Kramer MS. Discovery of the antidepressant and anti-emetic efficacy of substance P receptor (NK1) antagonists. Trends Pharmacol Sci. 1999; 20: 485–490.
93. Papp M, Vassout A, Gentsch C. The NK1-receptor antagonist NKP608 has an antidepressant-like effect in the chronic mild stress model of depression in rats. Behav Brain Res. 2000; 115: 19–23.
94. Sadowski J, Kubinyi H. A scoring scheme for discriminating between drugs and nondrugs. J Med Chem. 1998; 41: 3325–3329.
95. Schneider G. Neural networks are useful tools for drug design. Neural Networks. 2000; 13: 15–16.
96. Lee ML, Schneider G. Scaffold architecture and pharmacophoric properties of natural products and trade drugs: Application in the design of natural product-based combinatorial libraries. J Comb Chem. 2001; 3: 284–289.
97. Ghose AK, Pritchett A, Crippen GM. Atomic physicochemical parameters for three dimensional structure directed quantitative structure-activity relationships III: Modeling hydrophobic interactions. J Comput Chem. 1988; 9: 80–90.
98. Viswanadhan VN, Ghose AK, Revankar GR, Robins RK. Atomic physicochemical parameters for three dimensional structure directed quantitative structure-activity relationships. 4. Additional parameters for hydrophobic and dispersive interactions and their application for an automated superposition of certain naturally occurring nucleoside antibiotics. J Chem Inf Comput Sci. 1989; 29: 163–172.
99. Matthews BW. Comparison of the predicted and observed secondary structure of T4 phage lysozyme. Biochim Biophys Acta. 1975; 405: 442–451.
100. Harvey A. Strategies for discovering drugs from previously unexplored natural products. Drug Discovery Today. 2000; 5: 294–300.
101. Schuchhardt J, Schneider G, Reichelt J, Schomburg D, Wrede P. Local structural motifs of protein backbones are classified by self-organizing neural networks. Protein Eng. 1996; 9: 833–842.
102. Stahl M, Bur D, Schneider G. Mapping of proteinase active sites by projection of surface-derived correlation vectors. J Comput Chem. 1999; 20: 336–347.
103. Stahl M, Taroni C, Schneider G. Mapping of protein surface cavities and prediction of enzyme class by a self-organizing neural network. Protein Eng. 2000; 13: 83–88.
104. Alberts IL, Nadassy K, Wodak SJ. Analysis of zinc binding sites in protein crystal structures. Protein Sci. 1998; 7: 1700–1716.
105. Young MM, Skillman AG, Kuntz ID. A rapid method for exploring the protein structure universe. Proteins. 1999; 34: 317–332.
106. Böhm HJ. Prediction of binding constants of protein ligands: a fast method for the prioritization of hits obtained from de novo design or 3D database search programs. J Comput-Aided Mol Des. 1998; 12: 309–323.
107. Bernstein FC, Koetzle TF, Williams G J B, Meyer EF, Brice MD, Rodgers JR, et al. The Protein Data Bank: A computer-based archival file for macromolecular structures. J Mol Biol. 1977; 112: 535–542.
108. Verdonk ML, Cole JC, Taylor R. SuperStar: A knowledge-based approach for identifying interaction sites in proteins. J Mol Biol. 1999; 289: 1093–1108.
109. Chothia C. The nature of accessible and buried surfaces in proteins. J Mol Biol. 1975; 105: 1–14.
110. Engelman DM, Steitz TA, Goldman A. Identifying nonpolar transbilayer helices in amino acid sequences of membrane proteins. Annu Rev Biophys Biophys Chem. 1986; 15: 321–353.
111. Jones DD. Amino acid properties and side chain orientations in proteins: a cross-correlation approach. J Theor Biol. 1975; 50: 167–183.
112. Weber L, Wallbaum S, Broger C, Gubernator K. Optimization of the biological activity of combinatorial compound libraries by a genetic algorithm. Angew Chemie Int Ed. 1995; 34: 2280–2282.
113. Loos HS, Fritzke B. Some competitive learning methods. University of Bochum: Technical Report, 1997. A neural network demo software is accessible at URL: http://www.neuroinformatik.ruhr-uni-bochum.de/ini/VDM/research/gsn/DemoGNG/GN .