NIH Public Access Author Manuscript; accepted for publication in a peer-reviewed journal.
J Comput Chem. Author manuscript; available in PMC Aug 13, 2009.
Published in final edited form as:
PMCID: PMC2726574
NIHMSID: NIHMS115557

Q-Dock: Low-resolution flexible ligand docking with pocket-specific threading restraints

Abstract

The rapidly growing number of theoretically predicted protein structures requires robust methods that can utilize low-quality receptor structures as targets for ligand docking. Typically, docking accuracy falls off dramatically when apo or modeled receptors are used in docking experiments. Low-resolution ligand docking techniques have been developed to deal with structural inaccuracies in predicted receptor models. In this spirit, we describe the development and optimization of a knowledge-based potential implemented in Q-Dock, a low-resolution flexible ligand docking approach. Self-docking experiments using crystal structures reveal satisfactory accuracy, comparable with all-atom docking. All-atom models reconstructed from Q-Dock’s low-resolution models can be further refined by even a simple all-atom energy minimization. In decoy-docking against distorted receptor models with a root-mean-square deviation, RMSD, from native of ~3 Å, Q-Dock recovers on average 15–20% more specific contacts and 25–35% more binding residues than all-atom methods. To further improve docking accuracy against low-quality protein models, we propose a pocket-specific protein-ligand interaction potential derived from weakly homologous threading holo-templates. The success rate of Q-Dock employing a pocket-specific potential is 6.3 times higher than that previously reported for the Dolores method, another low-resolution docking approach.

Keywords: Q-Dock, ligand docking, low-resolution docking, pocket-specific potential, protein models, threading

Introduction

Computational modeling of protein-ligand interactions is of great importance in modern structural biology and has many applications in investigating fundamental biochemical processes and in the development of new pharmaceutical compounds 1–3. During the past years, a number of diverse algorithms for docking small molecules into receptor proteins have been developed 4–7 and evaluated in terms of docking accuracy and the ability to predict binding affinities 8–10. In general, docking algorithms seek to identify the lowest free energy position of a ligand in the binding site of the receptor protein. These algorithms are designed to reproduce the experimentally given structure of a receptor protein complexed with a ligand and to rank all generated solutions such that the conformation closest to the experimental structure appears as the top model. There are two key elements of a docking approach: First, a scoring function is required that accurately ranks the generated set of solutions. Second, a fast and effective search algorithm is necessary to explore the conformational space of protein-ligand interactions. Search efficiency is particularly important in virtual screening experiments 11,12 that require many thousands of possible ligands to be docked into a receptor structure in an acceptable amount of time, usually no more than a few minutes per ligand.

Docking programs typically utilize high-resolution receptor structures determined by experiment or theoretical modeling 13–15. Virtual screening reveals that the success of the docking calculation typically depends on the quality of the receptor structure, with the success rate decreasing from ligand-bound to ligand-free to modeled structures 16. This drop-off is correlated with the degree of protein movement in the active site; protein active site rearrangements greater than 1.5 Å lead to an almost complete lack of recovery of the “true” binding mode 17. Furthermore, decoy-docking experiments using deformed trypsin structures with a Cα RMSD varying from 1 to 3 Å as targets for docking of 47 ligands experimentally known to bind to trypsin revealed that the specific native contacts between the ligands and their receptor structures are rapidly lost with the deformation of the receptor structure 18.

On the other hand, protein models can now be routinely determined by high-throughput modeling procedures for entire proteomes. Many of the protein structures generated by structure prediction algorithms appear as attractive targets for the development of biologically active compounds 19. As demonstrated by CASP7, the quality of theoretical methods for protein tertiary structure prediction has improved, and in many cases, predicted models are comparable to low-resolution experimental structures 20,21. Nonetheless, these models have significant structural inaccuracies in side-chain and backbone coordinates when compared to ligand-bound, experimentally solved structures. An estimated one half of weakly homologous protein models have an RMSD from the native binding site >2 Å 22.

A variety of different docking techniques have been developed to address this problem. Most account for receptor flexibility by docking ligands against a precalculated ensemble of receptor conformations 23 or by softening the criterion for the steric fit between the ligand and receptor 24. To overcome the limitation of computationally expensive modeling of macromolecules, the Flexibility Tree, which combines a variety of efficient motion descriptors, has recently been developed and implemented in the FLIP-Dock program 25. Other docking techniques capable of dealing with significant structural inaccuracies employ a low-resolution representation of the protein. It has been shown that an ultra-low-resolution (~7 Å) representation of molecular structure averages all high-resolution structural details and dramatically improves the tolerance to receptor structure deformation 26. A similar approach used to dock small molecules into low-resolution models demonstrated that even low-quality receptor structures could be efficiently utilized in docking experiments 27. Nevertheless, most low-resolution docking approaches neglect ligand flexibility.

The desire to improve the state of the art motivated us to develop Q-Dock, an approach that effectively utilizes low-quality protein structures as targets for flexible ligand docking. Q-Dock describes both the ligand and the protein in a reduced representation. Ligand flexibility is accounted for by docking an ensemble of precalculated discrete ligand conformations, with Replica Exchange Monte Carlo (REMC) used to optimize the binding mode of the ligand in the binding site of the rigid receptor protein. Here, we describe the development and optimization of a coarse-grained knowledge-based potential implemented in Q-Dock. The performance of Q-Dock is compared with several popular all-atom programs for flexible ligand docking in a self-docking experiment using the crystal structures of target receptors. Next, we evaluate the efficiency of Q-Dock in a decoy-docking study against a set of distorted receptor structures whose Cα RMSD from the crystal structure ranges from 1 to 3 Å. Finally, with a view to improving the quality of ligand-receptor pose predictions, we take full advantage of pocket-specific potentials derived from weakly homologous threading templates and apply them to the docking of ligands against modeled receptor structures.

Methods

Dataset

The structures of protein-ligand complexes were selected from the Protein Data Bank 28 according to the following criteria: Protein structures determined by X-Ray crystallography to a resolution ≤2.5 Å and that have at least 50 residues were chosen. Organic molecules, cofactors, single nucleotides and short peptides composed of standard or modified amino acids were considered as ligands if the number of predefined functional groups (listed in Table 1) was ≥5 and ≤25. To exclude non-specific ligand interactions, a minimum number of 5 residues in contact with the ligand atoms is imposed. Interatomic contacts are calculated by LPC 29 that defines contacts based on an analysis of interatomic surfaces. Structures containing two or more ligands within 9 Å of each other were rejected. Subsequently, the complexes were subjected to a clustering procedure that uses a cutoff of 35% amino acid sequence identity between clusters. Two homologous proteins (members of one cluster) were accepted into the dataset only if the Tanimoto coefficient 30, TC, calculated for their ligands was below 0.5. A high TC (typically 0.7 – 1.0) is indicative of very high chemical similarity. In this manner, a dataset of 1636 complexes was created, which can be considered as non-redundant with respect to protein-ligand interactions. This dataset was then divided into two sets: a training set of 818 complexes used to derive the statistical potential and then to optimize force field parameters and weights and a benchmark set of 818 structures used exclusively to assess the derived potentials. Training set proteins with a sequence identity ≥35% to any of the 34 targets used in the docking experiment (described below) were exchanged with randomly selected benchmark proteins so that no proteins with ≥35% sequence identity to any docking target are used to derive and optimize the force field parameters.
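The ligand-similarity filter described above can be illustrated with a minimal sketch. It assumes ligands are encoded as sets of "on" fingerprint bits; the fingerprint type is not specified in the text, and the function names `tanimoto` and `accept_homolog_pair` are hypothetical.

```python
def tanimoto(bits_a: set, bits_b: set) -> float:
    """Tanimoto coefficient between two fingerprints given as sets of on-bits."""
    union = len(bits_a | bits_b)
    # Two empty fingerprints share nothing measurable; treat them as dissimilar.
    return len(bits_a & bits_b) / union if union else 0.0


def accept_homolog_pair(bits_a: set, bits_b: set, tc_cutoff: float = 0.5) -> bool:
    """Keep two homologous complexes only if their ligands have TC below the cutoff."""
    return tanimoto(bits_a, bits_b) < tc_cutoff
```

With this convention, two ligands sharing half of their combined bits (TC = 0.5) would not both be retained, since the criterion requires TC strictly below 0.5.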

Table 1
Predefined chemical groups used to decompose ligands into quasichemical building blocks.

The performance of Q-Dock was evaluated in a self-docking experiment for the set of protein-ligand complexes for which comparative assessments of several programs for all-atom flexible molecular docking were reported 31–33. From the original dataset we removed three structures of cytochrome P-450 that contain two ligands in the binding pocket. The resulting set consists of 34 protein-ligand complexes (PDB codes: 1abe, 1abf, 1apt, 1apu, 1cbx, 1cil, 1cnx, 1etr, 1ets, 1ett, 1gsp, 1icm, 1icn, 1nnb, 1nsc, 1nsd, 1okl, 1pph, 1rhl, 1rls, 1tng, 1tni, 1tnj, 1tnk, 1tnl, 1tpp, 2ifb, 3cpa, 3ptb, 3tmn, 5abp, 5tln, 6cpa, 6tmn).

Next, we used Q-Dock in a decoy-docking study against distorted receptor structures. The decoy dataset consists of 291 models of trypsin of which 93, 101 and 97 structures have a Cα RMSD from the crystal structure of 1, 2 and 3 ±0.5 Å, respectively. The distorted receptor models were used as targets for docking 47 ligands co-crystallized with trypsin. Details concerning the preparation of distorted models of trypsin and ligand selection are presented elsewhere 18. We compared the results of Q-Dock decoy-docking with the results reported for all-atom docking by Kim and Skolnick 18.

Finally, the performance of Q-Dock was evaluated for weakly homologous protein models used as targets for docking flexible ligands. From the set of 318 proteins for which the results of the Dolores method were reported 27, we selected 206 proteins up to 300 residues in length. Protein structure modeling consisted of template identification followed by an assembly/refinement procedure. First, for each target protein, weakly homologous structure templates were selected from a non-redundant PDB library by our threading algorithm PROSPECTOR_3 34,35, which was designed to identify analogous as well as homologous templates. We note that only threading templates with a sequence similarity to the target protein <35% were used in the modeling procedure. Subsequently, threading templates were submitted to TASSER 36–38, a coarse-grained template assembly/refinement procedure guided by tertiary restraints extracted from threading templates. Weakly homologous protein models were then taken as targets for the prediction of ligand binding sites using FINDSITE, a method that identifies ligand-binding sites based on binding site similarity among superimposed groups of template structures identified from threading 22. Ligand-binding sites predicted by FINDSITE were used to extract pocket-specific protein-ligand restraints from the threading templates to support low-resolution docking of flexible ligands into the theoretical receptor structures using Q-Dock.

Q-Dock force field

To quantitatively describe protein-ligand interactions, a combined knowledge-based potential was derived from the regularities observed in training protein-ligand complexes. The generic part of the force field (EGEN) consists of four energy terms that account for different energetic contributions. ECP (contact potential) accounts for the attractive and repulsive interactions between protein residues and ligand functional groups, i.e. it favors a specific orientation of a small molecule in the binding pocket. The surface-dependent terms ESL and ESP are in general less specific, scaled to the portion of the accessible solvent area of ligand functional groups and binding pocket residues that becomes buried upon complex formation. The differences in the accessible solvent area in the complexed and fully solvated states are used to express the burial likelihood for ligand functional groups and binding pocket residues. Moreover, we include a bias to the expected number of contacts, ECN (spatially neighboring residues), for ligand functional groups. Finally, to ensure the best native-like recognition capability, the force field parameters were optimized against the ensemble of ligand decoys and the energy terms were combined with optimized weight factors.

Reduced model of protein-ligand complexes

A knowledge-based potential implemented in Q-Dock was developed for simplified models of ligands and receptor proteins. We employed the following coarse-grained representation of protein-ligand complexes: Protein residues are represented by Cα atoms and single points at their side-chain centers of mass. For glycine residues, only the Cα positions are used. Ligand molecules are first decomposed into 17 chemical groups, which are listed in Table 1. A single effective point is then placed at the center of mass of each group. Since conformational space for protein-ligand interactions is defined continuously, a repulsive potential is essential to account for the volume exclusion among a ligand and a protein. We defined two repulsion shells: a ligand group – side-chain repulsion shell SijR and a ligand group – backbone repulsion shell BjR. The pair-specific repulsive shell SijR was defined as the minimum distance between effective points of the side chain center of mass of amino acid i and ligand functional group j. For each effective ligand point, a backbone repulsion shell BjR is defined as the minimum observed distance from any Cα atom in crystal structures of protein-ligand complexes. The excluded volume between units is approximated by a strong energy penalty when the distance between them is below the cutoff values of SijR or BjR.

Ligand – side-chain contact potential

For each pair of amino acid i and ligand functional group j, a unique contact shell was defined. The limiting values for the pair-specific SijC were calculated for the protein-ligand complexes present in the dataset using the Matthews correlation coefficient, MCC:

$$\mathrm{MCC} = \frac{TP \times TN - FP \times FN}{\sqrt{(TP+FP)(TP+FN)(TN+FP)(TN+FN)}}$$
Eq. 1

where TP and TN are the numbers of true positives and true negatives, and FP and FN are the numbers of false positives and false negatives, respectively. TP, TN, FP and FN were obtained by comparison to the interatomic interactions calculated for all-atom models. A residue and a ligand functional group are defined to be in contact if any of their heavy atoms are found to be in contact as reported by the LPC algorithm 29, which is based on interatomic contact surface analysis. For each pair of effective points i and j, a pair-specific contact shell SijC is determined by the distance cutoff that maximizes MCC.
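The cutoff selection can be sketched as a simple scan over candidate distances. This is an illustrative sketch only: the candidate grid, the input layout (one coarse-grained distance and one all-atom contact label per residue/group pair) and the function names are assumptions, not the published implementation.

```python
import math


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient; returns 0.0 when the denominator vanishes."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def best_contact_cutoff(distances, in_contact, cutoffs):
    """Return (cutoff, MCC) for the candidate cutoff that maximizes MCC.

    distances  : coarse-grained distances between effective points (angstroms)
    in_contact : True where the all-atom model reports a contact
    cutoffs    : candidate shell radii to scan (hypothetical grid)
    """
    best_cutoff, best_score = None, -2.0  # MCC is bounded below by -1
    for cutoff in cutoffs:
        tp = tn = fp = fn = 0
        for dist, truth in zip(distances, in_contact):
            predicted = dist <= cutoff
            if predicted and truth:
                tp += 1
            elif predicted:
                fp += 1
            elif truth:
                fn += 1
            else:
                tn += 1
        score = mcc(tp, tn, fp, fn)
        if score > best_score:
            best_cutoff, best_score = cutoff, score
    return best_cutoff, best_score
```

In practice the scan would be run separately for each amino acid / functional group pair, so that every pair obtains its own shell radius.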

The limiting distances were subsequently used to extract the observed number of contacts between a given pair of amino acid i and ligand functional group j in the training set of protein-ligand complexes (nij). The observed number of contacts is then compared to that expected in a reference state where there are no specific interactions:

$$n_{ij}^{0} = N \times x_i \times x_j$$
Eq. 2

where nij0 is the expected number of contacts between amino acid i and ligand functional group j, N is the total number of contacts between any pair of protein-ligand effective points, and xi and xj are the mole fractions of units i and j in the training set, respectively. For protein residues, the mole fractions are calculated with respect to surface residues only. A surface residue is defined as having ≥30% of its total surface exposed. We used POPS-A 39 for the solvent accessible area calculations.

The potential of mean force PC between amino acid i and ligand functional group j is simply given by:

$$P_{ij}^{C} = -\ln\frac{n_{ij}}{n_{ij}^{0}}$$
Eq. 3
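As a rough illustration of Eqs. 2 and 3, the sketch below converts observed contact counts into a log-odds potential. It assumes the conventional negative sign, so that contacts observed more often than expected receive a negative (favorable) energy; the residue and group labels, the numbers and the function name are made up for the example.

```python
import math


def contact_potential(n_obs: dict, x_res: dict, x_grp: dict) -> dict:
    """Log-odds contact potential from observed pair counts.

    n_obs : {(residue, group): observed number of contacts}
    x_res : mole fraction of each residue type (surface residues only)
    x_grp : mole fraction of each ligand functional group
    """
    total = sum(n_obs.values())  # N in Eq. 2
    potential = {}
    for (res, grp), n_ij in n_obs.items():
        n_expected = total * x_res[res] * x_grp[grp]  # Eq. 2
        if n_ij > 0 and n_expected > 0:
            potential[(res, grp)] = -math.log(n_ij / n_expected)  # Eq. 3
    return potential


# Toy counts (illustrative only, not data from the paper).
counts = {("ASP", "amine"): 30, ("ASP", "methyl"): 10,
          ("LEU", "amine"): 10, ("LEU", "methyl"): 50}
toy = contact_potential(counts,
                        {"ASP": 0.4, "LEU": 0.6},
                        {"amine": 0.4, "methyl": 0.6})
```

Pairs with no observed contacts are simply left out here; how such pairs are treated in the actual force field is not stated in the text.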

Non-polar surface-dependent potential

The change in a solvent accessible area upon complex formation is accounted for as a surface-dependent potential. The non-polar surface-dependent potential PS is based on the differences in the accessible solvent area of a ligand functional group or a protein residue in the complexed and fully solvated states 40:

$$P_{i}^{S} = -\ln\frac{g_i(ASA_C)}{g_i(ASA_S)}$$
Eq. 4

where gi is the probability distribution of the solvent accessible area attached to unit i in the complexed state (ASAC) compared to the solvated state (ASAS). The distribution function g is calculated for ligand groups and amino acids by a statistical analysis of the protein-ligand complexes present in the training set. For proteins, only binding pocket residues are taken into consideration. The solvent accessible area of coarse-grained models of both ligands and proteins was approximated by the modified method of Wodak and Janin 41,42 (the details are given in the Appendix).

Contact number

A bias to the expected number of neighboring residues for each ligand functional group is incorporated into the force field as

$$E_{CN} = \sum_{j=1}^{L} \left| N_j - N_j^{0} \right|$$
Eq. 5

where L is the total number of effective points in the ligand molecule, Nj is the observed number of contacting residues (calculated using the pair-specific contact shell SijC) and Nj0 is the expected number of neighbors (the mean value calculated for protein-ligand complexes in the training set).

Generation of decoys

The energy parameters as well as the energy weight factors were optimized against an ensemble of decoy conformations. For each protein-ligand complex, an ensemble of non-redundant flexible decoys was constructed as follows: In the first step, 109 ligand orientations were created. A sphere of 7 Å radius centered on the center of mass of the ligand in the native conformation was imposed, such that if a ligand molecule leaves the sphere it will enter through the opposite side. Subsequently, the number of ligand variations was reduced by using hard-sphere steric potentials SijR and BjR to account for volume exclusion between the ligand and the protein. To avoid the overaccumulation of some ligand orientations, a pairwise position similarity cutoff was used to ensure that the RMSD of any pair of decoys is larger than 3.5 Å. In addition, for each twenty non-native decoys (RMSD from native >3.5 Å), one native-like conformation (RMSD from native ≤3.5 Å) was generated and included into the decoy ensemble to account for the ligand distribution around the native position.

Parameter optimization

Similarly to Genetic Algorithms, Evolution Strategies (ESs) are algorithms which imitate the principles of natural evolution as a method to solve parameter optimization problems 43,44. ESs are random strategies, and as such are particularly robust and cope well with a large number of variables, or rugged objective functions. We employed the ES algorithm to improve the native-like recognition capability by the optimization of the force field parameters against the ensemble of ligand decoys. For each energy term, its parameters were optimized independently using the values derived from the statistical analysis as the initial set. The objective function to minimize (G) was the combination of the correlation between the energy function and the RMSD from the native ligand position (CC), the Z-score (the dimensionless ratio of the first and second moments of energy distribution within the native-like pool and the decoy pool) and the B-score (the fraction of decoys with an energy higher than that of at least one native-like conformation):

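$$G = \frac{1}{1 + \frac{1}{N}\sum_{p=1}^{N} CC_p} \times \frac{1}{1 + \frac{1}{N}\sum_{p=1}^{N} Z\text{-}score_p} \times \frac{1}{1 + \frac{1}{N}\sum_{p=1}^{N} B\text{-}score_p}$$
Eq. 6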

where N is the total number of training protein-ligand complexes, and CC_p, Z-score_p and B-score_p are the coefficients calculated for complex p.
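A minimal sketch of the objective function follows, assuming Eq. 6 is the product of three factors of the form 1 / (1 + mean score) and that all three scores are defined so that larger values indicate better native-like recognition. The function name and the sample values are hypothetical.

```python
def objective_g(cc, z_scores, b_scores):
    """Objective G: product of 1 / (1 + mean) over CC, Z-score and B-score.

    Each argument holds one value per training complex p.
    A smaller G corresponds to better average recognition capability.
    """
    n = len(cc)
    if n == 0 or len(z_scores) != n or len(b_scores) != n:
        raise ValueError("need one CC, Z-score and B-score per complex")
    mean_cc = sum(cc) / n
    mean_z = sum(z_scores) / n
    mean_b = sum(b_scores) / n
    return (1.0 / (1.0 + mean_cc)) * (1.0 / (1.0 + mean_z)) * (1.0 / (1.0 + mean_b))


# Illustrative values only: the second parameter set scores better on all terms.
g_initial = objective_g([0.3, 0.4], [1.0, 1.2], [0.6, 0.7])
g_optimized = objective_g([0.5, 0.6], [1.5, 1.8], [0.8, 0.9])
```

An optimizer such as an evolution strategy would then propose new parameter sets, recompute the per-complex scores on the decoy ensemble, and keep the sets that lower G.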

Weight optimization

It was already shown for reduced protein models that the combined energy with optimized weight factors has higher correlation coefficients and native-like recognition ability than a naive combination of energy terms (all the weight factors set to 1) and each of the single energy terms alone 45. We used this observation to optimize the energy weight factors. The optimization was done using the CERN MINUIT package 46. Similar to the optimization of force field parameters, this procedure minimizes the objective function G as defined by Eq. 6.

Ligand move set

We allow rotational and translational freedom of a small molecule within a restricted area of the receptor protein. A spherical distribution is sampled to generate random vectors located on a spherical surface. To speed up the conformational space sampling, their normalized components 47 ($|v|^2 = x_1^2 + x_2^2 + \dots + x_6^2 = 1$) are used as the scaling factors of the translational (1.0 Å) and rotational (10 deg) steps of a random walk. For each protein-ligand complex, we also allow for the random perturbation of the ligand’s internal conformation sampled according to a uniform distribution.
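One way to picture this move set is sketched below. It draws a random point on the unit sphere in six dimensions by normalizing Gaussian deviates (a standard technique, not necessarily the one used in Q-Dock) and assumes, purely for illustration, that the first three components scale the translation along x, y, z and the last three scale rotations about those axes.

```python
import math
import random


def random_unit_vector(dim: int = 6, rng=random):
    """Uniform random point on the unit sphere in `dim` dimensions."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(x * x for x in v))
        if norm > 1e-12:  # guard against a (vanishingly unlikely) zero vector
            return [x / norm for x in v]


def propose_rigid_move(max_trans: float = 1.0, max_rot_deg: float = 10.0, rng=random):
    """Return (translation in angstroms, rotation in degrees) for one MC step.

    The split of the six components into three translations and three
    rotations is an assumption made for this sketch.
    """
    v = random_unit_vector(6, rng)
    translation = [max_trans * x for x in v[:3]]
    rotation = [max_rot_deg * x for x in v[3:]]
    return translation, rotation
```

Because the six components share a single unit norm, a step that translates by nearly the full 1.0 Å necessarily rotates very little, and vice versa, which keeps the overall perturbation bounded.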

Similar to other docking algorithms that employ a pre-docking generation of multiple ligand conformations 5,48, ligand flexibility in Q-Dock is accounted for by docking an ensemble of discrete ligand conformations into the receptor protein. First, the set of conformations is generated for the all-atom ligand representations using the torsion angles as the degrees of freedom. Torsion angles are identified with the aid of the Autotors program available from AutoDock 6,49. The number of states for each ligand dihedral angle depends on the hybridization of the linked atoms: three states (60 deg, 180 deg, 300 deg) are considered for two sp3 hybridized atoms, two states (0 deg, 180 deg) for two sp2 hybridized atoms and 12 states (starting from 0 deg with a 30 deg step) for all other combinations 5. Conformations with steric clashes (when the distance between two non-bonded atoms is <2 Å) are excluded. Moreover, a structural similarity cutoff is imposed to ensure that any two ligand conformations in the ensemble have an RMSD >1 Å. Subsequently, all-atom ligand representations are decomposed into 17 chemical groups, see Table 1, and a single effective point is placed at the center of mass of each group.
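The discrete torsion states can be enumerated as in the following sketch. The hybridization labels are passed in as plain strings, and the clash and RMSD filters described above are deliberately omitted because they require 3D coordinates; the function names are hypothetical.

```python
from itertools import product


def torsion_states(hyb_a: str, hyb_b: str):
    """Allowed dihedral values (degrees) for a rotatable bond between two atoms."""
    pair = {hyb_a, hyb_b}
    if pair == {"sp3"}:
        return [60, 180, 300]
    if pair == {"sp2"}:
        return [0, 180]
    return list(range(0, 360, 30))  # 12 states for all other combinations


def enumerate_conformers(rotatable_bonds):
    """Yield every combination of torsion states for a list of (hyb_a, hyb_b) bonds."""
    return product(*(torsion_states(a, b) for a, b in rotatable_bonds))
```

For a ligand with one sp3-sp3 bond and one sp2-sp2 bond this yields 3 x 2 = 6 candidate conformations before any clash or similarity filtering; the count grows multiplicatively with each additional rotatable bond.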

Energy minimization

For a reasonable force field, a ligand native pose should appear as the lowest energy conformation. To determine the deviation of the lowest energy pose from experiment, we performed simple low-resolution energy minimization using the Simplex method 50 starting from the crystal structures. Energy minimization was carried out for training as well as benchmark protein-ligand complexes using the statistical and optimized sets of parameters with optimized energy weight factors.

Binding mode optimization (docking)

To efficiently explore the conformational space in docking simulations, we used Replica Exchange Monte Carlo (REMC) 51–53. The temperature range was chosen such that at the lowest temperature a protein-ligand complex is stable in the native structure, whereas at the highest temperature, a ligand freely explores conformational space. A 7 Å radius sphere is imposed to prevent the ligand molecule from moving too far from the binding site in the high temperature replicas. Q-Dock utilizes 16 replicas, each created by randomly choosing the position of a ligand in the vicinity of the binding pocket. The simulations consist of 100 attempts at replica exchange and 100 MC steps between replica swaps. The lowest energy ligand conformation identified in all replica trajectories is taken as the final model.
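The exchange step of a replica exchange simulation can be sketched with the standard Metropolis-type swap criterion. This is the textbook form, with energies expressed in units where the Boltzmann constant equals 1; the exact criterion and temperature schedule used in Q-Dock are not given in the text, and the function names are hypothetical.

```python
import math
import random


def swap_probability(e_i: float, e_j: float, t_i: float, t_j: float) -> float:
    """Probability of exchanging configurations between replicas i and j.

    Standard criterion: min(1, exp[(1/T_i - 1/T_j) * (E_i - E_j)]).
    """
    delta = (1.0 / t_i - 1.0 / t_j) * (e_i - e_j)
    return 1.0 if delta >= 0.0 else math.exp(delta)


def attempt_swap(e_i: float, e_j: float, t_i: float, t_j: float, rng=random) -> bool:
    """Return True if the swap between the two replicas is accepted."""
    return rng.random() < swap_probability(e_i, e_j, t_i, t_j)
```

A swap that hands the lower-energy pose to the colder replica is always accepted, which is how low-energy binding modes found at high temperature migrate toward the lowest-temperature replica.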

Pocket-specific protein-ligand potential

To improve docking accuracy, particularly against low-quality protein models, we incorporated into the force field a pocket-specific protein-ligand interaction potential that is derived from weakly homologous (<35% sequence identity to a target protein) threading holo-templates. First, structure templates are identified by the threading algorithm PROSPECTOR_3 34,35 and used to predict ligand-binding sites and binding residues by the recently developed FINDSITE algorithm 22. A short overview of FINDSITE is provided in the Appendix. To derive a pocket-specific protein-ligand interaction potential, we used binding pockets predicted for each target protein by FINDSITE. Protein-ligand contacts are calculated for all threading templates that share a top-ranked predicted binding site. These are used to extract the observed number of contacts between a binding residue corresponding to position k in the target sequence (the chemical properties of binding residues are ignored) and a ligand functional group of type j. Subsequently, the expected number of contacts in a reference state is calculated as in Eq. 2. Then, a pocket-specific potential of mean force EPS between a binding residue at position k in the target sequence and a ligand functional group of type j is given by Eq. 3, but now averaged over the FINDSITE-identified ligands and functional groups. The total energy now becomes the sum of weighted generic energy terms (EGEN) and the pocket-specific energy (EPS):

$$E_{TOT} = E_{GEN} + w_{PS} E_{PS}$$
Eq. 7

The weight wPS was optimized using the objective function G (Eq. 6) over the subset of 426 proteins ≤400 residues in length selected from the training set. During the optimization of wPS, the generic weights were kept fixed at previously optimized values. Native-like recognition capability was then separately assessed for the subsets of proteins ≤400 residues selected from the training (426 cases) and benchmark complexes (400 cases).

Reconstruction of all-atom models and simple high-resolution refinement

The final models obtained from Q-Dock simulations can be easily transformed into their all-atom representation. Reconstruction consists of the translation and rotation of all-atom ligand structures and the adjustment of dihedral angles so that the centers of mass of the functional groups overlap exactly with those predicted by the low-resolution docking simulation. The rebuilt protein-ligand complexes are subsequently refined by a simple energy minimization procedure using Amber8 54 with the all-atom force field ff03 55 used for proteins in conjunction with the general Amber force field 56, GAFF, for ligand molecules. Hydrogen atoms are added by the Open Babel package 57. To speed up ligand parameterization, partial charges on ligand atoms were approximated by the Gasteiger-Marsili 58 formalism. A Coulombic potential on a 1 Å grid was calculated by LEaP (Amber8) in order to place chloride or sodium ions at positions of the highest or lowest electrostatic potential around a protein-ligand complex to neutralize it. Long-range non-bonded interactions were truncated using a 12 Å cutoff (electrostatic and vdW). The protein was kept fixed during the simulation, whereas the conformation of the ligand was energy minimized in 250 cycles of steepest descent followed by 250 cycles of a conjugate gradient procedure.

Results

Ligand – side-chain contact potential

The averaged interactions between ligand functional groups and surface residues in the non-redundant library of 818 training protein-ligand complexes were used as a reference state for the calculation of a log odds potential that expresses the likelihood of interaction between ligand groups and protein residues. The average value of the MCC (defined in Eq. 1) for contacts using the reduced representation as compared to a detailed atomic model is 0.8, which suggests that the extracted contacts between effective points in reduced models reproduce well the real contacts between ligands and receptor proteins observed in all-atom structures. In general, favorable and unfavorable interactions between amino acids and ligand functional groups are found to be consistent with their physicochemical properties.

Non-polar surface-dependent potential

Solvent effects are accounted for as a non-polar surface-dependent potential. We observed that only a very small portion of the surface of hydrophobic groups remains solvent accessible; rather, the complete burial of hydrophobic groups is strongly favorable. At the same time, a “partially” buried state is favorable for most hydrophilic groups. The optimization procedure significantly enhances the preferences of polar and non-polar functional groups. Similar characteristics are observed for binding pocket residues.

Contact number

For the statistically derived set of parameters, the contact number simply expresses the average number of neighboring residues calculated for training protein-ligand complexes. The optimization procedure caused a significant increase in the expected contact number, which corresponds to a strong penalty for ligand conformations that only partially form a complex with the receptor protein (characterized by fewer contacts compared to the native conformation).

Minimization of native complexes

An accepted quality measure for the results of docking small molecules into receptor proteins is the root-mean-square deviation, RMSD, from the ligand position in the complex crystal structure 31–33. As a consequence of the imperfections of the force field as well as experimental deficiencies affecting reference conformations, the energy minimum often does not exactly correspond to the native conformation 40. Nevertheless, for a reasonable force field, the lowest energy pose of a ligand should not deviate substantially from the native conformation. To ascertain the deviation from experiment when Q-Dock’s force field is used, we performed a simple energy minimization, starting from the crystal structure. The simulations were carried out separately for each force field parameter set using the optimized weight factors for energy terms. The results obtained for the training and the benchmark set are shown in Figure 1. The minimization procedure slightly lowers the central tendency of the energy (Figure 1A) and causes an acceptable deviation from the crystal structures. In most cases, the lowest energy ligand positions do not deviate by more than 2.0 Å from the experimental structure (Figure 1B) and preserve >90% of the native protein-ligand contacts (Figure 1C).

Figure 1
Distribution of the combined generic energy EGEN (A), RMSD from the native structure (B) and the fraction of preserved native contacts (C) for minimized low-resolution models of training (top panel) and benchmark (bottom panel) protein-ligand complexes. ...

Native-like recognition capability

The native-like discriminatory power of Q-Dock was assessed by the correlation between the energy and RMSD from the native ligand position (CC), the relative energy gap between native-like structures and the ensemble of non-native decoys (Z-score), and the fraction of decoys with an energy higher than at least one native-like structure (B-score). The summary of native-like recognition capability is presented in Table 2. The parameters optimized on ligand decoys exhibit considerably higher discriminatory power than the statistically derived potential. Furthermore, the optimization of weight factors improved native-like recognition capability. Finally, the slight difference between the coefficients calculated for the training and benchmark sets argues against overfitting to the training complexes. Thus, in all subsequent calculations, only results for the optimized parameters are reported.

Table 2
Summary of Q-Dock’s native-like recognition capability for a large set of ligand decoys.

Weight optimization for pocket-specific restraints

To further improve docking accuracy against the crystal structures as well as low-quality predicted receptor structures, a pocket-specific protein-ligand interaction potential (EPS) was derived from weakly homologous (<35% sequence identity to a target protein) threading holo-templates and combined with the generic potential derived from the regularities observed in crystal structures of the training complexes (see Eq. 7). The value of wPS = 3.1 was found to maximize the native-like recognition capability (see Table 2).

Docking results for receptor crystal structures

The performance of Q-Dock was evaluated for 34 protein-ligand complexes for which comparative assessments of all-atom algorithms for flexible ligand docking were reported 31–33. The crystal structures of proteins were taken as targets for flexible ligand docking using the optimized generic parameter set (EGEN) as well as the generic potential combined with the pocket-specific threading restraints (ETOT). No proteins with >35% sequence identity to targets are in the training dataset. The top docked solutions obtained from Q-Dock simulations were transformed into their all-atom representations and refined by a simple energy minimization in an all-atom force field.

The results of docking simulations evaluated by the RMSD from the crystal structure are presented in Table 3. The overall performance of Q-Dock is comparable to many all-atom approaches; the average RMSD calculated for all-atom models reconstructed from top-ranked docked conformations obtained using EGEN and ETOT is 3.90 and 3.03 Å, respectively. In Figure 2, we show the examples of the energy versus the RMSD correlation for neuraminidase (Figure 2A, PDB-ID: 1nsc), intestinal fatty acid binding protein (Figure 2B, PDB-ID: 1icm), ribonuclease T1 (Figure 2C, PDB-ID: 1rhl) and thermolysin (Figure 2D, PDB-ID: 3tmn). With the pocket-specific restraints (ETOT), the global minimum is frequently closer to the native ligand pose. Moreover, the higher correlation between the energy and RMSD speeds up the convergence of the binding mode optimization.

Figure 2
Energy plotted as a function of RMSD for REMC trajectories collected for neuraminidase (A), intestinal fatty acid binding protein (B), ribonuclease T1 (C) and thermolysin (D). The simulations were carried out using the optimized generic parameters set ...

Table 3
Comparison of RMSD values for the top models from all-atom and coarse-grained flexible ligand docking. In Q-Dock simulations we employed the generic part of the force field (EGEN) as well as the generic potential combined with pocket-specific threading ...

Furthermore, we found that the high-resolution refinement improved the quality of the final models reconstructed from low-resolution images provided by Q-Dock. This is particularly pronounced for already well-docked solutions. Indeed, the vast majority of models with an RMSD <3.5 Å move towards the native pose. Next, we assessed the ability to select the native-like ligand conformation from the ensemble used to mimic ligand flexibility. Interestingly, native-like ligand conformers are often observed in the top docked solutions, even though the internal ligand energy was not evaluated and no energy minimization was applied. The average internal ligand RMSD from the native conformation calculated for the models reconstructed from top-ranked Q-Dock solutions obtained using EGEN and ETOT is 0.75 and 0.62 Å, respectively, whereas the average RMSD calculated for all conformations present in the ligand ensembles is 1.80 Å.

Examples of successful all-atom refinement are shown for neuraminidase and carboxypeptidase A in Figure 3. For neuraminidase (Figure 3A, PDB-ID: 1nnb), the RMSD of the inhibitor 2-deoxy 2,3-dehydro-N-acetyl neuraminic acid rebuilt from Q-Dock’s low-resolution top model is 3.99 Å. The final RMSD calculated for the inhibitor after all-atom refinement is 2.13 Å. The lowest energy pose of a phosphonate in the active site of carboxypeptidase A (Figure 3B, PDB-ID: 6cpa) reported by Q-Dock corresponds to an RMSD of 2.84 Å. High-resolution refinement shifted the ligand toward the native pose with a final RMSD of 1.18 Å.

Figure 3
Examples of a high-resolution refinement for neuraminidase (A) and carboxypeptidase A (B). Low-resolution images representing Q-Dock top-ranked solutions, all-atom models rebuilt from coarse-grained models and refined structures are presented in left, ...

Docking results for deformed receptor structures

In this experiment, we used 291 distorted models of bovine trypsin with a Cα RMSD of 1, 2 and 3 ±0.5 Å from the crystal structure as targets for low-resolution flexible ligand docking using Q-Dock. A total of 47 different ligands known experimentally to bind to trypsin were docked into each deformed receptor structure. No high-resolution refinement was applied. The results were compared to those reported for all-atom decoy-docking using AutoDock and FlexX 18. Figure 4 presents the accuracy of flexible ligand docking against the set of distorted receptor structures in terms of the fraction of correctly predicted specific native contacts and binding residues (non-specific contacts). We find that Q-Dock is far less sensitive to the deformation of the receptor protein than all-atom approaches. The average fraction of binding residues predicted for 1/2/3 Å RMSD decoys is 0.85/0.57/0.36 for AutoDock, 0.70/0.43/0.26 for FlexX and 0.87/0.68/0.62 for Q-Dock. Moreover, the average fraction of specific native contacts recovered for 1/2/3 Å RMSD decoys is 0.75/0.46/0.27 for AutoDock, 0.59/0.33/0.19 for FlexX and 0.62/0.47/0.42 for Q-Dock. In the case of the most distorted receptor structures (Cα RMSD of 3 ±0.5 Å), Q-Dock was able to predict on average 25–35% more binding residues and 15–20% more specific native contacts than all-atom approaches.
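
As a worked check on the comparison for the most distorted receptors, the gap between Q-Dock and each all-atom method can be computed directly from the average fractions reported above. The sketch reads "more" as an absolute difference in fraction (percentage points), which is an assumption; the dictionary and function names are illustrative only.

```python
# Average fractions reported in the text for 1, 2 and 3 Angstrom RMSD decoys.
BINDING_RESIDUES = {
    "AutoDock": (0.85, 0.57, 0.36),
    "FlexX": (0.70, 0.43, 0.26),
    "Q-Dock": (0.87, 0.68, 0.62),
}
SPECIFIC_CONTACTS = {
    "AutoDock": (0.75, 0.46, 0.27),
    "FlexX": (0.59, 0.33, 0.19),
    "Q-Dock": (0.62, 0.47, 0.42),
}

def gain_over(table, method, level=2, reference="Q-Dock"):
    """Absolute gain of the reference method over `method`.

    level indexes the distortion class: 0, 1, 2 for 1, 2, 3 Angstrom decoys.
    """
    return round(table[reference][level] - table[method][level], 2)

def all_gains(table, level=2):
    """Gains of Q-Dock over every other method at one distortion level."""
    return {m: gain_over(table, m, level) for m in table if m != "Q-Dock"}
```

Under this reading, the gaps at 3 Å are 0.26 (AutoDock) and 0.36 (FlexX) for binding residues, and 0.15 (AutoDock) and 0.23 (FlexX) for specific native contacts.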

Figure 4
Comparison of the flexible ligand docking results for 47 different ligands and deformed structures of trypsin with Cα RMSD of 1, 2 and 3 ±0.5 Å obtained using AutoDock (A), FlexX (B) and Q-Dock (C). Top and bottom plots show the ...

Docking results for receptor models

The weakly homologous protein models used in this study were generated by a threading-based protein structure prediction procedure that consists of structure template identification by PROSPECTOR_3 34,35 followed by template assembly/refinement using TASSER 36–38. Subsequently, the modeled protein structures were submitted to ligand-binding site prediction using the recently developed FINDSITE algorithm, which can accurately identify binding sites in experimentally solved protein structures as well as in approximate, theoretical models 22. Here, FINDSITE predictions were used to derive a pocket-specific potential for each target protein. We provided Q-Dock with the modeled receptor structures, predicted binding sites and pocket-specific restraints and carried out flexible ligand docking simulations employing ETOT (Eq. 7) as the objective function in ligand binding mode optimization and the selection of final models. Furthermore, to evaluate the improvement in docking accuracy against low-to-moderate quality protein models that results from including pocket-specific restraints, we performed simulations using EGEN (optimized generic energy terms only) instead of ETOT. The performance of Q-Dock was then compared with the results reported for Dolores, another low-resolution approach that docks rigid ligand structures into receptor proteins 27. The results obtained for the set of 206 target proteins, evaluated in terms of the fraction of predicted specific protein-ligand contacts as well as the fraction of recovered binding residues, are shown in Figure 5.

Figure 5
Fraction of predicted specific and non-specific (binding residues) native contacts identified by Dolores method and Q-Dock using weakly homologous protein models as targets for docking small ligands. Flexible ligand docking simulations by Q-Dock were ...

We note the higher accuracy of ligand-binding site prediction using FINDSITE compared to the grid-based method implemented in Dolores 27; the fraction of proteins with at least one native specific contact is 0.73 for Dolores and is 0.92 and 0.93 for Q-Dock employing EGEN and ETOT, respectively. In general, Q-Dock predicts considerably more specific protein-ligand contacts than Dolores, especially if pocket-specific restraints are applied (Figure 5, circles). For example, the fraction of proteins with ≥50% of recovered specific native contacts is 0.07 for Dolores, and 0.30 and 0.46 for Q-Dock employing EGEN and ETOT, respectively. Interestingly, the fraction of predicted binding residues depends entirely on the accuracy of ligand-binding site prediction and not on the presence of pocket-specific restraints (Figure 5, squares). The restraints aid the recovery of specific contacts as a result of the improved ability to predict a “true” ligand binding mode in the putative binding site of the receptor model.

Discussion

Despite progress in protein structure prediction, theoretical protein models frequently have structural inaccuracies in their side-chain and backbone coordinates when compared to experimentally determined structures. Since all-atom docking approaches were found to be highly sensitive to the structural distortions of the ligand binding region 16,17,59, they are inapplicable to such models. This deficiency has motivated the development of protocols capable of docking small molecules into the structurally distorted ligand-binding sites using low-resolution docking techniques 26,27,60,61. In this spirit, we have developed Q-Dock, an approach that effectively utilizes low-quality protein structures as targets for flexible ligand docking. The force field implemented in Q-Dock combines two classes of energy terms: generic knowledge-based potentials derived from the regularities observed in crystal protein-ligand complexes and pocket-specific potentials extracted for each target protein from ligand-bound forms of weakly homologous structure templates. The combined knowledge-based potential implemented in Q-Dock was derived from the statistics of crystal protein-ligand complexes and further optimized to increase the native-like recognition capability. The resulting potentials for low-resolution modeling of protein-ligand interactions seem to make good physical sense; they can be rationalized in terms of fundamental ligand-protein interactions including ionic interactions, hydrogen bonds, aromatic stacking or hydrophobic interactions.

Self-docking utilizing crystal structures of receptor proteins as targets for flexible ligand docking revealed that the accuracy of Q-Dock is comparable to all-atom approaches; in most cases, the native-like structures appear as the lowest-energy conformations. Furthermore, the low-resolution models can be transformed back into their all-atom representations and efficiently refined even by a simple all-atom minimization. For the vast majority of reasonably well-docked conformations reported by Q-Dock, the high-resolution refinement procedure considerably improved the quality of final models. Thus, low-resolution modeling serves as a valuable initial step for a more detailed structural analysis, as well as a complement to experimental and computational data obtained by other techniques 26,62. Moreover, the results obtained by docking an ensemble of discrete ligand conformations into receptor proteins show that ligand flexibility can be successfully included in low-resolution docking. Although the ligand internal energy was ignored, native-like ligand conformers were frequently observed in top docked solutions.

The main practical advantage of a coarse-grained docking methodology, such as Q-Dock, is the possibility of utilizing low-quality receptor structures routinely produced by proteome-scale protein structure modeling projects. Our decoy-docking study of flexible ligands against the distorted receptor models revealed that Q-Dock recovers on average 25–35% more binding residues and 15–20% more specific native contacts than all-atom approaches. In more than 90% of the cases, at least one ligand-binding residue was correctly predicted. Moreover, in almost one-third of the cases, the fraction of recovered specific protein-ligand contacts was ≥50%.

To take full advantage of predicted binding regions, we proposed a pocket-specific protein-ligand interaction potential derived from weakly homologous structure templates selected by threading, which can be used as valuable supplementary restraints in ligand docking against low-quality receptor structures. This yields a 6.3 times higher success rate for Q-Dock compared to the previously published Dolores method 27.

The tolerance to structural inaccuracies in receptor models clearly enhances the importance of protein models as reliable targets for virtual screening or structure-based drug design. Q-Dock represents a practical tool for utilizing the rapidly growing number of theoretically predicted protein structures in experiments that require an effective flexible ligand docking procedure.

Acknowledgments

This research was supported by grants No. GM-37408 and GM-48835 of the Division of General Medical Sciences of the National Institutes of Health. We gratefully acknowledge Dr. RyangGuk Kim for providing the set of deformed receptor structures and all-atom docking results.

Appendix

Solvent accessible surface estimation

To calculate the accessible surface area (ASA), we used an analytical approximation approach adapted from Wodak and Janin 42. This fast and reliable analytical model expresses the ASA as a function of interatomic distances only, and works at both the atomic and residue levels. It has been shown that when it is applied to simplified models of proteins 41,63 and nucleic acids 63, it reproduces the surface area calculated by accurate all-atom algorithms. The total solvent accessible area of a molecule is expressed as the sum of the ASA attached to all of its atoms:

$$\mathrm{ASA} = \sum_{i=1}^{N} A_i$$
Eq. App. 1

For a given atom i, the following expression can be applied to account for the intersecting spheres of the neighboring atoms:

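$$A_i = S_i \prod_{j \neq i}^{N} \left( 1 - \frac{p_i \, p_{ij} \, b_{ij}(r_{ij})}{S_i} \right)$$
Eq. App. 2

where $S_i$ is the solvent accessible area of isolated atom $i$, $b_{ij}(r_{ij})$ is the area cut out by the overlap with atom $j$ at a distance $r_{ij} = |\mathbf{r}_i - \mathbf{r}_j|$, and $p_i$, $p_{ij}$ are the empirical correction factors.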

The ASA of isolated atom i with radius Ri can be calculated using a solvent probe with radius Rsp (usually equal to 1.4 Å 64) as follows:

$$S_i = 4\pi (R_i + R_{sp})^2$$
Eq. App. 3

The area cut out of Ai by atom j can be calculated from

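$$b_{ij}(r_{ij}) = \begin{cases} \pi (R_i + R_{sp}) (R_i + R_j + 2R_{sp} - r_{ij}) \left( 1 + \dfrac{R_j - R_i}{r_{ij}} \right) & \text{if } r_{ij} < R_i + R_j + 2R_{sp} \\ 0 & \text{otherwise} \end{cases}$$
Eq. App. 4

where $R_i$ and $R_j$ are the radii of atoms $i$ and $j$, respectively.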
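
The analytical approximation in Eq. App. 1–4 can be sketched in a few lines. This is an illustrative implementation under simplifying assumptions, not the parameterized model used in Q-Dock: the function names are hypothetical, and the pair correction factor $p_{ij}$ is reduced to a single scalar (`p_pair`) instead of a table of optimized values.

```python
import math

def isolated_area(radius, probe=1.4):
    """Eq. App. 3: accessible area of an isolated sphere with a solvent probe."""
    return 4.0 * math.pi * (radius + probe) ** 2

def overlap_area(r_i, r_j, dist, probe=1.4):
    """Eq. App. 4: area cut out of sphere i by a neighbor j at distance dist."""
    if dist <= 0.0:
        raise ValueError("interatomic distance must be positive")
    reach = r_i + r_j + 2.0 * probe
    if dist >= reach:
        return 0.0
    return math.pi * (r_i + probe) * (reach - dist) * (1.0 + (r_j - r_i) / dist)

def atom_asa(i, coords, radii, p, p_pair=1.0, probe=1.4):
    """Eq. App. 2: area of atom i reduced by every overlapping neighbor.

    p holds the per-atom factors p_i; p_pair stands in for p_ij and is a
    simplifying assumption of this sketch.
    """
    s_i = isolated_area(radii[i], probe)
    product = 1.0
    for j in range(len(coords)):
        if j == i:
            continue
        dist = math.dist(coords[i], coords[j])
        b_ij = overlap_area(radii[i], radii[j], dist, probe)
        product *= 1.0 - p[i] * p_pair * b_ij / s_i
    return s_i * product

def total_asa(coords, radii, p, p_pair=1.0, probe=1.4):
    """Eq. App. 1: sum of the per-atom accessible areas."""
    return sum(atom_asa(i, coords, radii, p, p_pair, probe)
               for i in range(len(coords)))
```

Because only interatomic distances enter the calculation, the same routine applies unchanged to atoms, Cα positions, side-chain centers of mass or ligand functional groups, provided suitable radii and correction factors are supplied.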

Originally, the method was tested for all-atom as well as reduced representations of protein structures considering Cα atoms only. In our approach, the surface area is estimated based on the positions of the Cα atoms and the centers of mass of residue side chains and ligand functional groups. The initial full set of parameters for the 20 amino acids (radii R, empirical correction factors pi and pij) was taken from Cavallo and co-workers 39. The radii for ligand functional groups were obtained by statistical analysis of isolated ligand functional groups present in the set of protein-ligand complexes used in this study:

$$R_i = \sqrt{\frac{\langle S_i \rangle}{4\pi}}$$
Eq. App. 5

where $R_i$ is the estimated radius of ligand group $i$ and $\langle S_i \rangle$ is the average surface of the isolated group $i$, as calculated for all-atom models by LPC 29.

Subsequently, the initial set of parameters for protein residues and ligand functional groups was submitted to an optimization procedure to minimize the variance of ASA calculated for reduced models of protein-ligand complexes from the ASA calculated for their all-atom models by POPS-A 39 and LPC 29. Since in our model it is the residues in contact with a ligand that are important for protein-ligand interactions, the parameters for proteins were optimized over binding pocket residues only. For the optimized set of parameters, the accessible surface area calculated for all-atom models is reproduced by this coarse-grained method, considering ligands and proteins individually as well as in the complexed state with an average correlation coefficient of 0.94. The approximation of accessible surface area seems to be well suited for the practical use in low-resolution docking simulations using Q-Dock.

Prediction of ligand binding sites by FINDSITE

To predict ligand-binding sites in protein models and to derive a pocket-specific potential for protein-ligand interactions, we used the recently developed FINDSITE approach that detects ligand-binding sites based on the binding site similarity across superimposed groups of threading templates 22. FINDSITE not only works well for crystal structures but also exhibits a good tolerance to structural inaccuracies in modeled protein structures (up to a global backbone RMSD from the crystal structure of 8–10 Å); thus it is particularly well suited for ligand-binding site prediction in weakly homologous protein models. FINDSITE employs template identification, structure superimposition and binding site clustering as follows: First, for a given target sequence, structure templates are selected from a non-redundant PDB library by the threading program PROSPECTOR_3 34,35. PROSPECTOR_3 evaluates the score significance in terms of the Z-score of the sequence assigned to a given structure based on the average of the best alignment given by dynamic programming over the template library. FINDSITE requires threading templates with Z-scores ≥4. For the purpose of benchmarking, from the threading templates reported by PROSPECTOR_3 we used only those that have <35% sequence identity to the target protein. Subsequently, structures that contain a bound ligand molecule are identified and superimposed onto a reference structure using the structural alignment algorithm TM-align 65. In this study, we used TASSER-generated models 36–38 as reference structures for the template superimposition. Upon superimposition, the centers of mass of ligands bound to threading templates are clustered, and each cluster represents one putative binding site. Finally, the predicted binding sites are ranked according to the number of threading templates that share a common binding pocket. For each target protein, we selected the top-ranked predicted ligand-binding site for ligand docking and the derivation of a potential for pocket-specific protein-ligand interactions.
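
The clustering and ranking steps can be illustrated with a short sketch. It is not the FINDSITE implementation: the greedy single-pass clustering, the 8.0 Å cutoff and all names are assumptions chosen for brevity, and the input is taken to be ligand centers of mass already transformed into the frame of the reference structure.

```python
import math

def cluster_centers(centers, cutoff=8.0):
    """Greedy clustering of ligand centers of mass.

    centers: list of (template_id, (x, y, z)) after superimposition.
    cutoff: assumed distance threshold in Angstrom (hypothetical value).
    """
    clusters = []
    for template_id, point in centers:
        for cluster in clusters:
            if math.dist(point, cluster["centroid"]) <= cutoff:
                cluster["members"].append((template_id, point))
                n = len(cluster["members"])
                # Update the running centroid of the putative binding site.
                cluster["centroid"] = tuple(
                    sum(m[1][k] for m in cluster["members"]) / n for k in range(3)
                )
                break
        else:
            clusters.append({"members": [(template_id, point)], "centroid": point})
    return clusters

def rank_sites(clusters):
    """Rank putative sites by the number of distinct templates sharing them."""
    return sorted(
        clusters,
        key=lambda c: len({template for template, _ in c["members"]}),
        reverse=True,
    )

def top_site(centers, cutoff=8.0):
    """Centroid of the top-ranked putative binding site."""
    return rank_sites(cluster_centers(centers, cutoff))[0]["centroid"]
```

In this sketch the centroid returned by `top_site` plays the role of the predicted pocket center passed on to docking; a production method would typically use a more robust clustering scheme.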

Footnotes

Supplementary material

The list of training and benchmark complexes, coordinates of modeled protein structures, the results of ligand binding site prediction and docking of ligands into protein models as well as the optimized potential parameters and derived pocket-specific restraints may be found at http://cssb.biology.gatech.edu/skolnick/files/Q-Dock/

References

1. Gilson MK, Zhou HX. Annu Rev Biophys Biomol Struct. 2007;36:21–42.
2. Hermann JC, Marti-Arbona R, Fedorov AA, Fedorov E, Almo SC, Shoichet BK, Raushel FM. Nature. 2007;448(7155):775–779.
3. Joseph-McCarthy D, Baber JC, Feyfant E, Thompson DC, Humblet C. Curr Opin Drug Discov Devel. 2007;10(3):264–274.
4. Ewing TJ, Makino S, Skillman AG, Kuntz ID. J Comput Aided Mol Des. 2001;15(5):411–428.
5. Meiler J, Baker D. Proteins. 2006;65(3):538–548.
6. Morris GM, Goodsell DS, Halliday RS, Huey R, Hart WE, Belew RK, Olson AJ. J Comput Chem. 1998;19:1639–1662.
7. Rarey M, Kramer B, Lengauer T, Klebe G. J Mol Biol. 1996;261(3):470–489.
8. Ferrara P, Gohlke H, Price DJ, Klebe G, Brooks CL 3rd. J Med Chem. 2004;47(12):3032–3047.
9. Perola E, Walters WP, Charifson PS. Proteins. 2004;56(2):235–249.
10. Warren GL, Andrews CW, Capelli AM, Clarke B, LaLonde J, Lambert MH, Lindvall M, Nevins N, Semus SF, Senger S, Tedesco G, Wall ID, Woolven JM, Peishoff CE, Head MS. J Med Chem. 2006;49(20):5912–5931.
11. Cummings MD, DesJarlais RL, Gibbs AC, Mohan V, Jaeger EP. J Med Chem. 2005;48(4):962–976.
12. Kellenberger E, Rodrigo J, Muller P, Rognan D. Proteins. 2004;57(2):225–242.
13. Bissantz C, Bernard P, Hibert M, Rognan D. Proteins. 2003;50(1):5–25.
14. Enyedy IJ, Ling Y, Nacro K, Tomita Y, Wu X, Cao Y, Guo R, Li B, Zhu X, Huang Y, Long YQ, Roller PP, Yang D, Wang S. J Med Chem. 2001;44(25):4313–4324.
15. Evers A, Hessler G, Matter H, Klabunde T. J Med Chem. 2005;48(17):5448–5465.
16. McGovern SL, Shoichet BK. J Med Chem. 2003;46(14):2895–2907.
17. Erickson JA, Jalaie M, Robertson DH, Lewis RA, Vieth M. J Med Chem. 2004;47(1):45–55.
18. Kim RG, Skolnick J. J Comput Chem. 2007 submitted.
19. Zhang Y, Devries ME, Skolnick J. PLoS Comput Biol. 2006;2(2):e13.
20. Kryshtafovych A, Venclovas C, Fidelis K, Moult J. Proteins. 2005;61(Suppl 7):225–236.
21. Moult J, Fidelis K, Rost B, Hubbard T, Tramontano A. Proteins. 2005;61(Suppl 7):3–7.
22. Brylinski M, Skolnick J. Proc Natl Acad Sci U S A. 2007 submitted.
23. Huang SY, Zou X. Proteins. 2007;66(2):399–421.
24. Ferrari AM, Wei BQ, Costantino L, Shoichet BK. J Med Chem. 2004;47(21):5076–5084.
25. Zhao Y, Sanner MF. Proteins. 2007;68(3):726–737.
26. Vakser IA. Biopolymers. 1996;39(3):455–464.
27. Wojciechowski M, Skolnick J. J Comput Chem. 2002;23(1):189–197.
28. Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H, Shindyalov IN, Bourne PE. Nucleic Acids Res. 2000;28(1):235–242.
29. Sobolev V, Sorokine A, Prilusky J, Abola EE, Edelman M. Bioinformatics. 1999;15(4):327–332.
30. Willett P, Winterman VA. Quant Struct Act Relat. 1986;5:18–25.
31. Bursulaya BD, Totrov M, Abagyan R, Brooks CL 3rd. J Comput Aided Mol Des. 2003;17(11):755–763.
32. Chen HM, Liu BF, Huang HL, Hwang SF, Ho SY. J Comput Chem. 2007;28(2):612–623.
33. Taufer M, Crowley M, Price DJ, Chien AA, Brooks CL. Concurrency and Computation: Practice and Experience. 2005;17(14):1627–1641.
34. Skolnick J, Kihara D. Proteins. 2001;42(3):319–331.
35. Skolnick J, Kihara D, Zhang Y. Proteins. 2004;56(3):502–518.
36. Zhang Y, Arakaki AK, Skolnick J. Proteins. 2005;61(Suppl 7):91–98.
37. Zhang Y, Skolnick J. Biophys J. 2004;87(4):2647–2655.
38. Zhang Y, Skolnick J. Proc Natl Acad Sci U S A. 2004;101(20):7594–7599.
39. Cavallo L, Kleinjung J, Fraternali F. Nucleic Acids Res. 2003;31(13):3364–3366.
40. Gohlke H, Hendlich M, Klebe G. J Mol Biol. 2000;295(2):337–356.
41. Wodak SJ, Janin J. J Mol Biol. 1978;124(2):323–342.
42. Wodak SJ, Janin J. Proc Natl Acad Sci U S A. 1980;77(4):1736–1740.
43. Back T, Hoffmeister F, Schwefel HP. In: Proceedings of the 4th International Conference on Genetic Algorithms. Belew RK, Booker LB, editors. Morgan Kaufmann; San Diego, CA, USA: 1991. pp. 2–9.
44. Back T, Schwefel HP. Evolutionary Computation. 1993;1(1):1–23.
45. Zhang Y, Kolinski A, Skolnick J. Biophys J. 2003;85(2):1145–1164.
46. James F. MINUIT, Reference Manual, Version 94.1. CERN; Geneva, Switzerland: 1998.
47. Knuth DE. The Art of Computer Programming: Seminumerical Algorithms. Addison-Wesley; Reading, Massachusetts: 1997. pp. 135–136.
48. Lorber DM, Shoichet BK. Protein Sci. 1998;7(4):938–950.
49. Goodsell DS, Morris GM, Olson AJ. J Mol Recognit. 1996;9(1):1–5.
50. Nelder JA, Mead R. Computer Journal. 1964;7:308–313.
51. Fukunishi H, Watanabe O, Takada S. J Chem Phys. 2002;116(20):9058–9067.
52. Sugita Y, Kitao A, Okamoto Y. J Chem Phys. 2000;113(15):6042–6051.
53. Zhang Y, Skolnick J. J Chem Phys. 2001;115(11):5027–5032.
54. Pearlman DA, Case DA, Caldwell JW, Ross WR, Cheatham TE, DeBolt IS, Ferguson D, Seibel G, Kollman PA. Comp Phys Commun. 1995;91:1–41.
55. Duan Y, Wu C, Chowdhury S, Lee MC, Xiong G, Zhang W, Yang R, Cieplak P, Luo R, Lee T, Caldwell J, Wang J, Kollman P. J Comput Chem. 2003;24(16):1999–2012.
56. Wang J, Wolf RM, Caldwell JW, Kollman PA, Case DA. J Comput Chem. 2004;25(9):1157–1174.
57. Guha R, Howard MT, Hutchison GR, Murray-Rust P, Rzepa H, Steinbeck C, Wegner J, Willighagen EL. J Chem Inf Model. 2006;46(3):991–998.
58. Gasteiger J, Marsili M. Tetrahedron Lett. 1978;34:3181–3184.
59. Murray CW, Baxter CA, Frenkel AD. J Comput Aided Mol Des. 1999;13(6):547–562.
60. Bindewald E, Skolnick J. J Comput Chem. 2005;26(4):374–383.
61. Schafferhans A, Klebe G. J Mol Biol. 2001;307(1):407–427.
62. Vakser IA. Protein Eng. 1995;8(4):371–377.
63. Fraternali F, Cavallo L. Nucleic Acids Res. 2002;30(13):2950–2960.
64. Lee B, Richards FM. J Mol Biol. 1971;55(3):379–400.
65. Zhang Y, Skolnick J. Nucleic Acids Res. 2005;33(7):2302–2309.