Logo of narLink to Publisher's site
Nucleic Acids Res. 2006; 34(13): 3698–3707.
Published online 2006 Aug 7. doi:  10.1093/nar/gkl454
PMCID: PMC1540721

Protein binding site prediction using an empirical scoring function


Most biological processes are mediated by interactions between proteins and their interacting partners including proteins, nucleic acids and small molecules. This work establishes a method called PINUP for binding site prediction of monomeric proteins. With only two weight parameters to optimize, PINUP produces not only 42.2% coverage of actual interfaces (percentage of correctly predicted interface residues in actual interface residues) but also 44.5% accuracy in predicted interfaces (percentage of correctly predicted interface residues in the predicted interface residues) in a cross validation using a 57-protein dataset. By comparison, the expected accuracy via random prediction (percentage of actual interface residues in surface residues) is only 15%. The binding sites of the 57-protein set are found to be easier to predict than that of an independent test set of 68 proteins. The average coverage and accuracy for this independent test set are 30.5 and 29.4%, respectively. The significant gain of PINUP over expected random prediction is attributed to (i) effective residue-energy score and accessible-surface-area-dependent interface-propensity, (ii) isolation of functional constraints contained in the conservation score from the structural constraints through the combination of residue-energy score (for structural constraints) and conservation score and (iii) a consensus region built on top-ranked initial patches.


Biological cells function through a network of interacting proteins and other molecules. It has been estimated that the average number of interacting partners per protein is three to ten (1). In order for a protein to interact dynamically with multiple partners, the complexes of interacting proteins are often not obligatory but necessary transient with relatively weak binding affinity. Such a weak binding affinity, however, makes it difficult to solve the structures of transient complexes experimentally. As a result, there is a growing gap between the number of known interactions and the number of their 3-dimensional structures that are available. However, the 3D structures of protein complexes are pivotal for a full understanding of the mechanism of interactions because they provide specific interaction details at the atomic level. Such details are important for rational design of drug molecules to modulate protein interactions.

One way to solve this problem is molecular docking (2,3). In molecular docking, transient complex structures are predicted by docking one monomeric structure (typically the smaller one) onto the other. It consists of two steps: conformational sampling that generates multiple candidate complex structures and scoring that recognizes the near-native complex structures from the candidate complex structures. Here, we define a complex structure as transient if there exist corresponding monomeric structures. The accuracy of protein–protein docking (48) can be improved significantly if their binding region is known. This is because identification of binding regions dramatically reduces the conformational space of docking. Several recent studies attempted to predict possible protein–protein binding sites (interface residues) from known unbound monomer structures (9,10).

To predict interface residues, one needs to know what distinguishes an interface region from the rest of the protein surface. It was discovered that the interfaces of obligate complexes are outstandingly hydrophobic (11). The interfaces of some transient complexes were also found to be with clusters of hydrophobic residues (12). Moreover, they are rich in aromatic residues and arginine but depleted in other charged residues (13). However, hydrophobicity at the interfaces of transient complexes is not as distinguishable from the remainder of the surface as hydrophobicity at the interfaces of the obligate complexes (13,14). As a result, it is difficult to make an accurate prediction of the interfaces of transient complexes using a single parameter of residue interface propensity. Moreover, different interface properties of obligate and transient complexes make it necessary to treat them separately in prediction.

Evolution conservation of residues is another widely-used property to identify protein–protein interfaces (1519). Interface residues, especially those hot spot residues (20) were found to be more conserved than other surface residues. However, residue conservation is rarely sufficient for a complete and accurate prediction of protein interface (2123). Moreover, transient interfaces evolve faster than obligate ones (24). A more sensitive evolutionary tracing (ET) method, which delineates invariant residues responsible for subgroup accuracy, has been developed to uncover functionally important residues in proteins (6,25,26).

Several authors studied residue-energy distributions on the protein surface to identify functional sites. Hertzberg and Moult found that steric strains in the polypeptide backbone are located overwhelmingly in regions concerned with function (27). Elcock (28) predicted functionally important residues based on the assumption that they have a high electrostatic energy, as calculated with continuum methods. Ota et al. (29) combined stability profile and sequence conservation to predict catalytic residues in enzymes. Chelliah et al. (30) predicted binding sites by distinguishing restraints that arise from the structure of the protein and those from intermolecular interactions. Cheng et al. (31) used all-atom computational protein design methodology to compute the free energy difference between the naturally occurring amino acids and the lowest free-energy amino acids. Functional sites were found to have residues with sub-optimal free energies.

Other interface-distinguishing features have also been exploited. For example, Jones and Thornton (32) found that protein interfaces are among the most planar and most accessible patches. In structural terms, a binding site has a preference for β-sheets and relatively long non-structured chains, but not for α-helices (10). It was also shown that the binding site of an unbound monomer is surrounded by more bound water molecules and has a lower temperature B-factor than the rest of the surface (10). This result is consistent with the finding that an interfacial surface region is less flexible than the rest of the protein surface (33). More rigid interface region in unbound structures suggests that interface residues are ‘prepared’ for the loss of side-chain conformational entropy upon binding. In another study, Fernandez and Scheraga (34) found that most backbone hydrogen bonds in the majority of soluble proteins are thoroughly wrapped intramolecularly by non-polar groups except for a few that are likely around a binding site.

Because none of the above-mentioned properties are able to make an unambiguous identification of interface regions or patches (32), a combination of some of them [via either a linear combination (9,10) or machine learning (3537)] was found to be effective for improving the accuracy of binding-site prediction. However, the overall accuracy remains low compared to expected random values.

This study is spurred by the finding that residues at the interfaces of protein–protein complexes have higher side-chain energies (i.e. less stable) than the other surface residues (38) based on a score function originally developed for protein design (39). The high energies of the observed interfaces are independent of the amino acid composition in the interface regions. Moreover, residues with high-energy scores are unstable and likely evolve fast. This suggests that the energy score, a suitable indicator of interface residues, is ‘orthogonal’ to properties describing interface propensity and residue conservation. In other words, combination of the three terms might improve the accuracy of interface prediction.

In this paper, we test this hypothesis by developing a method, called Protein Interface residUe Prediction (PINUP). PINUP predicts interface residues using an empirical score function made of a linear combination of the energy score, interface propensity and residue conservation score. To our knowledge, none of the previously published work has combined these terms together to predict protein binding sites. We show that PINUP provides a significant improvement in the accuracy of binding site prediction over any single or pairwise combinations of the three terms. The PINUP server is freely available on http://theory.med.buffalo.edu/PINUP.


The scoring function for interface-residue identification

The scoring function for a surface residue i, E(i), is made of three terms: side-chain energy score, Esidechain(i), residue-conservation score, Econsv(i) and residue interface propensity Epropensity(i). That is,


where wc and wp are to-be-determined weight factors for the conservation and propensity scores, respectively.

1. Side chain energy score

We use a scoring function that was originally developed for protein design (39) to calculate the energy of a rotamer (40), the representative conformation of the amino acid, placed on its backbone position. The score for a given rotamer R of a residue i, Esidechain(Ri), is a linear combination of multiple energetic terms:


where Scontact, Voverlap, Ehbond, Eelec, ΔSpho and ΔSphi are atom-contact surface area, overlap volume, hydrogen bonding energy, electrostatic interaction energy, buried hydrophobic solvent accessible surface and buried hydrophilic solvent accessible surface between the rotamer of residue i and the rest of the protein, respectively; Fphi is the fraction of the buried surface of non-hydrogen-bonded hydrophilic atoms; Δ(Fphi)30 is the difference between the rotamer positioned in the protein environment and the isolated form; Vexclusion is the normalized solvent exclusion volume around charged atoms; f1 is the observed frequency of the rotamer and f2 is the observed frequency of the amino acid residues in a given backbone conformation; Nssbond is the flag of disulfide bridge(1 or 0); ΔGref is the difference between the free energy of the rotamer in solvent and denatured protein. The weights of these energy items together with the reference values were optimized so that the native residue was predicted energetically favorable at each position of the training proteins (39).

The energy score of a particular amino acid i is calculated as:


where the summation is over all the rotamers available for a given residue type and the constant prefactor f is 1/2.41. This constant factor is from the slope of the regression line between the calculated and experimentally-measured unfolding ΔΔG of a set of point mutation data (39). Thus, the energy unit is kcal·mol−1, the same as the experimental data.

2. Residue conservation score

Residue conservation is measured by the self-substitution score from the sequence profile. Sequence profiles are obtained by three iterations of PSI-BLAST searches against non-redundant (NR) database posted on Dec 5, 2005 (ftp://ftp.ncbi.nih.gov/blast/db/FASTA/nr/gz) with the BLOSUM62 (41) substitution matrix. The conservation score at the position i is defined as


where Mir is the self-substitution score in the position-specific substitution matrix generated from PSI-BLAST for the residue type r at sequence position i and Brr is the diagonal element of BLOSUM62 for residue type r. A slightly different criterion based on positive substitution scores has been used in Ref. (15). Here, we use the difference in self-substitution scores for simplicity.

3. Residue interface propensity

We define a residue-interface propensity, Epropensity(i), that depends on its accessible surface area.


where Printerface and Prsurface are the contribution of residue type r to the interface area and to the protein surface area, respectively, and Sr and Srave are the relative accessible surface area of residue r at the sequence position i and the average relative accessible surface area of surface residues of type r, respectively. The Cα atom of Gly is considered as a side chain atom for convenience. Printerface and Prsurface are obtained from statistical analysis of 75 protein–protein complexes (13). Srave for 20 amino acid residues are obtained from statistical analysis of eight unbound monomers of four protein–protein complexes belonging to different families: barnase-barstar, hen egg white lysozyme-antibody Fab D44.1, acetylcholinesterase-fasciculin and chymotrypsin-OMTKY3. We found that eight unbound monomers yield enough statistics for calculation of Srave. Details can be found in the Results section. The values of (lnPrinterface/Prsurface)/Srave are listed in Table 1.

Table 1
The values of (InPrinterface/Prsurface)/Srave for 20 amino acid residues

PINUP algorithm for predicting the interface residues

The PINUP algorithm is as follows:

  1. Identification of surface residues. As in a previous study (38), surface residues are defined as those side chains with a relative accessibility of >6% (probe radius = 1.2 Å).
  2. Identification of candidate binding surface patches. A surface patch is defined as a central surface residue and 19 nearest neighbors as in a previous study (38). The score of a patch is given by the average value of the scores for all 20 residues by using the above-described scoring function. All of the surface residues are sampled. Solvent vector constraints (32) are applied in order to avoid patches sampling at different sides of a protein surface. Top 5% scored patches are selected. If the number of surface residues for a protein is less than 100, five top-scored patches are selected, instead.
  3. Locating candidate interface residues. Typically, the above selected patches overlap with each other. That is, one residue can appear in multiple patches. We rank residues based on the number of top-scored patches to which they belong (the appearance rate in top-scored patches). Top-ranked 15 residues are designated as candidate interface residues. For large proteins with more than 150 surface residues, we retain up to 10% of total surface residues. If the last candidate residue (e.g. the 15th residue for proteins with less than 150 residues) has the same appearance rate in the top-scored patches as several other non-candidate residues, all of them are included in the candidate interface residues.
  4. Prediction of a continuous binding interface. The final predicted interface is defined as the largest continuous patch made of the ‘interacting’ candidate interface residues. Two residues are considered interacting if the distance between any two respective side chain atoms is <1 Å plus the sum of the van der Waals radius of the two atoms. If a surface residue is surrounded by the predicted interface residues and it does not interact with other surface residues, the residue will be included as interface residues. The van der Waals radii for all atom types are from the CHARMM21 parameter set (42).

There are several parameters, such as the definition of surface residues [Step (1)] and the size of surface patches [Step (2)] in this PINUP algorithm. Effects from varying these parameters are discussed in the Results section.

Protein datasets

We use a set of 57 non-homologous proteins collected by Neuvirth et al. (10) for training and cross validation. In this set, antibodies and antigens are not included since their specific binding mode is optimized through rapid somatic cell mutations instead of evolution over many years. Our algorithm relies on conservation information and, thus, is not suitable for predicting antigen–antibody interfaces. The structures of the unbound monomers and complexes are obtained from PDB (43). The program REDUCE (44) is used to add hydrogen atoms to all proteins. Non-polar hydrogen atoms and all water molecules are deleted. The binding sites are predicted with unbound structures. The complex structures are used to define the experimental interface residues for the unbound monomers. A surface residue is considered as interface residues if its accessible surface area is decreased by more than 1 Å2 upon complexation.

To further test PINUP, we use the protein–protein docking benchmark 2.0 established by Chen et al. (45). This benchmark contains 62 protein complexes (excluding antigen–antibody), in which 68 unbound proteins can be considered as an independent test set because they share <35% sequence identity with any protein in the 57-protein dataset described above. This 68-protein set contains 42, 18 and 8 proteins with minor, medium and large-scale conformational change upon complexation, respectively.

There is a significant homologous relation between the 75 proteins used for deriving interface propensity and the 57 proteins used for cross validation. We test the dependence of prediction accuracy on the dataset used for deriving interface propensity and find that the dependence is essentially negligible. Details can be found in the Results section.

Assessment of prediction accuracy

Prediction accuracy is assessed by the coverage of the actual interface by the predicted interface, which is the fraction of correctly predicted interface residues in the total number of observed interface residues, and the accuracy of the predicted interface, which is the fraction of correctly predicted interface residues in the total number of predicted interface residues. The expected accuracy from random prediction is the fraction of observed interface residues in the total number of surface residues.

Optimizing the weights

We use a simple grid method to optimize the weights of wc and wp. An initial scanning suggests the optimal values located at 0 < wc < 2 and 1 < wp < 10. The final weights are obtained by a simple grid search within 0 < wc < 2 with a step of 0.2 and 1 < wp < 10 with a step of 1. The parameters are optimized for the highest accuracy.


Training and cross validation

Side chain energy scores, residue conservation scores and interface propensity are utilized to predict protein–protein interface for the unbound protein structure. The weights of the three items are balanced to achieve the highest accuracy rate by a grid-based search method. Leave-one-out cross validation is performed on the 57 proteins. Briefly, one protein is removed from the training set. The weights, optimized with the other 56 proteins, are used to predict the binding site of the removed protein. This procedure is repeated 57 times until all the proteins have been left out. In 55 out of the 57 cases, the optimized weights are 1, 1.6 and 6 for energy score, conservation score and interface propensity, respectively. Compared with training on the whole dataset, the average prediction accuracy is only decreased by 0.7% in the leave-one-out procedure. This indicates that the weights obtained are robust for other proteins.

The prediction accuracy for the 57 proteins from the leave-one-out testing is shown in Table 2. The average values of coverage and accuracy are 42 and 45%, respectively. The average size of predicted interfaces is 21 residues, similar to the average size of observed interfaces (22 residues). It is of interest to note that both coverage and accuracy for interface prediction of inhibitors are significantly higher than those of either enzymes or other proteins.

Table 2
Leave-one-out cross validation for 57 unbound protein structures

There are six proteins predicted with more than one interface region. PINUP only accepts the largest region. Nonetheless, both the largest and the second largest regions are overlapped with the observed interface in three of the six proteins. If both regions were considered as the predicted interface, the coverage would be higher.

There are six proteins for which none of the interface residues are correctly predicted. Among them, the unbound crystal structure of nitric oxide synthase oxygenase domain (1nos) has only one interface residue with coordinates. It is the second-ranked interface in PINUP having an overlap with the single interface residue. All other interface residues identified in the bound structure have no structure (invisible in crystal structure) in the unbound structure. That is, these interface residues are not possible to be predicted because they are not part of surface residues in the unbound structure. For the horse plasma gelsolin (1d0n), small number of actual interface residues in a large surface may have led to the failure of PINUP to locate its interface residues. This protein with 589 surface residues has a relatively small interface of 24 residues. The failure is also likely due to the difficulty to generate the surface patches with same shape and size as the observed interface for a given unbound structure. A closer examination indicates that the observed interface of 1d0n ranks at the top 1 in all generated surface patches of the same size. Another cause for completely failed predictions is the existence of other possible binding sites. The predicted protein binding site of aspartate transcarbamoylase (1ekx) is overlapped with N-phosphonacetyl-l-aspartic acid (PAL) ligand binding site, rather than the observed protein–protein interface. The three remaining proteins with failed interface predictions are domains rather than the whole protein. The interface for domain–domain interactions may have prevented correct predictions. For example, a domain–domain interaction site of Enteropathogenic Escherichia coli intimin C-terminal domain (1f00), is predicted as the protein–protein binding interface.

One interesting question is what happens if bound structures are used for interface prediction. We find that when the bound structure of 1nos is used, the prediction coverage and accuracy are 91 and 50%, respectively. However, in general, the prediction accuracy and coverage by using bound structures is slightly lower than by using the unbound structures. One contributing factor is that the crystal structures of transient complexes (i.e. bound structures) have a lower resolution than the unbound monomers. Moreover, the coordinates of some surface residues (other than the interface residues) are often missing. This leads to misidentification of the exposed core region as surface residues and thus, reduction of the overall accuracy of interface prediction by including the exposed core region in the prediction.

Dissection of interface properties

To understand the relative contribution of each scoring term in binding site prediction, the prediction accuracies of individual scoring terms and their pairwise combinations are shown in Table 3. Residue interface propensity is the most effective factor in predicting protein–protein interfaces with 37% coverage and 35% accuracy, respectively. Accuracy increases by 4% when interface propensity is combined with either energy score or conservation score and by another 6% when combined with both scores. All values are much higher than the average expected accuracy (15%). Figure 1 illustrates the interfaces predicted by the single and combined scoring terms for profilin (1pne). Different single terms give different levels of false positives (in red), true positives (in yellow) and false nagatives (in green) in interface prediction. The combined score gives the best prediction.

Figure 1
Comparison of protein interfaces predicted by single and combined scoring terms Red, predicted interface; green, observed interface; yellow, overlapped regions between predicted and observed interfaces. The interfaces were predicted by residue interface ...
Table 3
Effect of individual and combined scores

One can also compare the average score of interfaces with that of surfaces. We find that only 31 out of the 57 proteins (54%) have more conserved interfaces. This weak-conservation result is consistent with findings of other groups (21,22). By comparison, the interface residues possess higher energy scores in 48 out of the 57 proteins. We also find that interface residues of 41 proteins (out of the 57 proteins) are more solvent accessible than other surface residues. We do not use surface accessibility as a term for binding prediction because a highly solvent-accessible residue usually has a higher energy score (38).

Another way to analyze the importance of individual terms is to test their abilities to rank observed interface in generated interfaces of the same size. As shown in Table 3, the energy score is the best in three terms for ranking observed interfaces. This happens despite that it has the lowest accuracy in interface prediction. The poorer performance of the energy score in prediction than in ranking is because high-energy interface residues are often surrounded by low-energy surface residues. As Table 4 shows, while the scores of interface residues are distinctively high, the scores of their surrounding residues are lower than those regular surface residues. Inclusion of non-interface surrounding residues leads to the low average rank (4.1) of the generated 20-residue initial patches that have maximum overlap (13 residues in average) with the observed binding interface, compared to 2.7 of the native interface.

Table 4
The average side chain energies of interface, surrounding and other surface residuesa

Dependence on empirical parameters

There are many empirical parameters in the PINUP algorithm. For example, surface residues are defined as those residues with >6% relative accessibility. We chose this small cut-off value for surface residues so that it is relative easy for the formation of a continuous path from the selected interface resides. If this cut-off value is set to 25%, we have to relax the definition of two connected residues in order to make possible the prediction of a continuous binding patch. Thus, we further define that two residues are connected when the distance between any two respective side chain atoms is <2 Å (rather than 1 Å) plus the sum of the van der Waals radii of the two atoms. The PINUP algorithm with the new cut-off values for surface residues and connected residues and new optimized weight factors yields an average coverage of 45%, accuracy of 44% and expected accuracy of 16%, respectively for the 57 proteins. Thus, the change of prediction accuracy (from 43, 45 and 15% for coverage, accuracy and expected accuracy, respectively) is small, compared to the change of cut-off values. Another important parameter is the size of the candidate surface patches. We set it to 20, a number close to the average size of the observed interfaces of the 57 proteins (22). If the patch size is set to 15 (25), the average prediction coverage and accuracy for the same 57 proteins are 40% (43%) and 40% (42%), respectively. Changes of other parameters are also tested. For example, selecting 10% of top ranked patches (rather than 5%) leads to 41% for coverage and 46% for accuracy. Selecting top 10 (or 20, default 15) residues as potential interface residues for small proteins leads to 38% (48%) for coverage and 46% (41%) for accuracy. All these calculations indicate that the overall accuracy of interface prediction has a weak dependence on empirical parameters. In general, if the coverage is higher, the accuracy is lower and vice versa. We have chosen the parameters so that coverage and accuracy are similar in magnitude.

Independent testing

PINUP is tested on the protein–protein docking bechchmark 2.0. The results for the whole set and a subset of proteins non-homologous (independent) to the training set are shown (Table 5). The average coverage, accuracy for the independent 68-protein set are 30.5 and 29.4%, respectively. They are 12 or 15% lower than the corresponding cross-validation value of the 57 training proteins.

Table 5
Testing PINUP with the whole (in parentheses) and a subset of non-homologous proteins in the protein–protein docking benchmark 2.0a

The test set is further classified into rigid-body (easy), medium difficult and difficult sets according to the magnitude of conformational change after binding (45). Generally, the coverage and accuracy are reduced as the magnitude of binding-induced conformational changes increases (Table 5). Each subset is divided into enzymes, inhibitors and others. It is clear that most accurate predictions are made for inhibitors, as in the Jacknife cross validation of the 57-protein set. The average coverage and accuracy for 12 non-homologous inhibitors are 52.4 and 51.5%, respectively. The corresponding numbers are 26.5 and 26.6% for 7 enzymes, respectively and 25.7 and 24.4% for 49 other proteins, respectively. The excess accuracy over the expected accuracy is also the highest for inhibitors (27.2% for inhibitors versus 13.4% for enzymes and 12.5% for others).

The significantly reduced accuracy for the test set is not caused by over-training. The average coverage and accuracy (31.8 and 32.3%, respectively) is improved only a little even when the 68 proteins are used to train the two free parameters. The optimized weights for energy score, conservation score and propensity score are 1, 1.4 and 5, respectively. When the 57 proteins of Neuvirth et al. (10) are predicted with the new parameters, the coverage and accuracy are 43 and 44%, respectively.

The significantly reduced accuracy for the test set is also not caused by the significant homologous relation between the 75 proteins used for deriving interface propensity and the 57 proteins used for cross validation. We recalculate interface propensity by using the 68 protein complexes of the independent benchmark. The resulting propensity is used to predict the interfaces of the same 68 protein complexes. The prediction coverage and accuracy of the new method after re-optimizing the weight factors (31 and 32% for coverage and accuracy, respectively) are essentially unchanged from the results from independent testing (31 and 29%, respectively). This illustrates that interface propensity derived from the testing protein complexes did not help to improve the accuracy of interface prediction for those protein complexes.

We also recalculate interface propensity by using the 57 non-homologous protein complexes of the training and validation benchmark. For a given protein complex, it is excluded from the calculation of interface propensity and training of weight parameters. This jackknife cross validation yielded 40 and 41% for the average coverage and accuracy, respectively. (The average coverage and accuracy are 40 and 42%, respectively, when both interface propensity and Srave are derived from the training and cross validation set). This is only 3% (or 4%) lower than the average coverage (or accuracy) predicted from the original interface propensity. The result further confirms that the homologous proteins used for calculation of interface propensity is not responsible for the higher accuracy for the training and cross-validation benchmark.

Lower prediction accuracy for the test benchmark means that the binding sites of proteins in the training benchmark are easier to predict than that in the testing benchmark. This is because the binding interfaces for the proteins in the testing benchmark are less distinguishable from the rest of protein surfaces than in the training benchmark. Indeed, the difference of the interface propensity between 20 amino acid residues derived from the independent benchmark of 68 proteins is significantly smaller than that of the original interface propensity. Moreover, interfaces regions are more conserved than surface regions only in 26 of the 68 proteins (38%), compared to 54% for the 57-protein training set. The difficulty to predict interfaces of some proteins are also found in other studies (36,37). The varying performance in different benchmarks indicates the limitation of existing benchmarks. This is the direct consequence of a limited number of NR transient complex structures available in the protein databank. Thus, it would be more representative to combine two benchmarks into one. Because 56 proteins (or their homologs) in the 57 protein training set are in the protein–protein docking benchmark 2.0, the results for all 124 proteins in this benchmark (36.3 coverage and 37.5% for accuracy, see Table 5) provide a more reasonable estimate of the overall accuracy for PINUP.

Comparison with other methods

It is difficult to compare performance made by different methods for binding site prediction. Some (10,35) optimize their methods to make a few accurate predictions (high accuracy and low coverage) while others (36) emphasize on a large coverage of actual binding site (high coverage). More importantly, different definitions of surface/interface residues also produce different expected values from random prediction. For example, a relaxed definition of interface residues will lead to an increase in expected accuracy. Thus, the comparison between different methods has to focus on the excess from expected values. Here, we make an attempt to have a comparison between PINUP and two recent studies.

Neuvirth et al. (10) developed a method called ProMate to distinguish interface regions from the rest of protein surfaces based on 13 properties. For the same 57 proteins for training and cross validation, they reported a success rate of 36 proteins with accuracy of 50% or higher. This rate is significantly higher than a random success rate of 13 proteins. By comparison, the corresponding success rate and expected success rate for PINUP are 25 and 3 proteins, respectively. Thus, the excess success rate from the expected value given by PINUP (22) is similar to that given by ProMate (23). However, PINUP gives a significantly higher number of proteins with coverage of 50% or higher than ProMate does. It is 20 for PINUP but 0 for ProMate. Here, we used the same random model proposed by Neuvirth et al. to generate expected values. In this model, the predicted scores are reshuffled over the amino acid residues before extracting the patches. The low rate of correct prediction in application of this random model to the PINUP algorithm is mostly due to the relatively large number of predicted interface residues in this study. The larger the number of predicted residues, the smaller the random probability that half of the predicted residues are true interface residues. By comparison, ProMate focuses on making a small number of predictions and thus intrinsically has a high expected success rate. Thus, PINUP is as accurate as ProMate in terms of the location of interface but has a larger overlap with the actual interface.

Chen and Zhou (37) developed a method called PPISP, which trained a neural network with sequence profiles and solvent accessibility of spatially neighboring surface residues to predict interface residues. In their paper, they defined coverage and accuracy as the fraction of correctly predicted interface residues for all proteins in total observed interface residues for all proteins and in total predicted interface residues for all proteins, respectively. Similarly, the expected accuracy was defined as the fraction of total interface residues in total surface residues for all proteins. To distinguish from the coverage and accuracy defined in this work, we called coverage and accuracy defined by Chen and Zhou as the overall coverage and overall accuracy, respectively. Their method was tested on protein–protein docking benchmark 1.0 of Chen et al. (45) The overall coverage and overall accuracy is 50% and 50% for the enzyme-inhibitor category but only 28 and 31%, respectively, for other proteins in the rigid body group. PINUP is tested on the more extensive benchmark 2.0. Results are shown in Table 5. PINUP yields 49% of overall coverage and 52% of overall accuracy for the enzyme-inhibitor category. The corresponding values are 29 and 27%, respectively, for others. Thus, the overall accuracy is similar between PINUP and PPISP. However, the overall expected accuracy (13% for enzymes, 20% for inhibitors and 10% for others) of PINUP for this benchmark may be lower than that of PPISP (26% for all heterodimers in the training set).


In this paper, we present the method PINUP to locate binding sites of unbound structures of proteins. This method is based on a scoring function that is a linear combination of a side-chain energy score, interface propensity and residue conservation score. We find that the linear combination provides a 10% improvement in accuracy and 6% improvement in coverage over the residue interface propensity—the best single-term score function. Thus, combining independent parameters is one reason for achieving high coverage and high accuracy at the same time.

The combination of residue-energy and conservation scores also serves the purpose of separation of functional constraints and structural constraints for binding site predictions. A residue with a low energy is likely conserved for structural constraints. Subtraction of the negative energy score (expected conservation score for structural reasons) from the observed conservation score likely produces a score to better identify functionally important residues. In the work of Chelliah et al. structural constraints are separated from functional constraints by using surface accessibility (30). Here we use a residue energy score because the energy score is more effective than solvent accessibility in distinguishing interface regions from the rest of protein surface (38).

Another reason for the improved prediction of PINUP is to define interface as the largest continuous patches composed of residues having highest appearance in 5% top-scored initial patches, rather than as the top-scored initial patch. This consensus over top-scored initial patches permits the prediction of a patch with varied shape and size whereas the initial patch is round and fixed at 20 residues. More importantly, the interfaces in large unbound monomers are often made of relatively buried, low-energy residues surrounded by high-energy residues (38). As a result, a single top-scoring patch centered on a peripheral residue will contain only a small number of the interface residues. Indeed, if only one top-scored initial patch is selected, the coverage and accuracy are only 38 and 39%, respectively, for the set of 57 proteins.

The improvement in binding-site prediction is also in part due to the use of native coordinates. Side-chain energy score, originally developed for protein design is a weighted average energy score over all possible rotamers. Unlike the early work (38), the coordinates of the rotamer that is closest to the native side chain conformation are replaced by the coordinates of native side-chain to reduce the discrete error. Without this replacement, the combined score would produce 42% of accuracy rather than 45% (Table 3).

Recently, Cheng et al. found that functional sites are more likely (than non-functional sites) to have computed sequence profiles that differ significantly from the naturally occurring sequence profiles and to have residues with sub-optimal free energies (31). We have tested a new energy score ln(n + 1), where n is the number of other residue types having lower energies than the replaced native residue. This single residue-energy score yields a accuracy of 24% and its combination with conservation score yields a accuracy of 33%. This performance is much better than the original energy score. However, the combination of this new score with conservation and interface propensity leads to an accuracy of 43%. It is still slightly lower than 45% by PINUP. One possible reason is the stronger correlation between the new energy score and interface propensity. Residues with a high-energy score of the new energy term often are exposed hydrophobic residues, which also have high interface propensity score. In addition, the redefined energy score cannot be calculated with the native side-chain conformation. The new energy scores of all residue types were calculated with rotamers because there is no native conformation for mutants and this discretization error may also affect the prediction accuracy.

We also test a method that defines interface residues based on the scores of individual residues. The levels of accuracy for predicted interfaces are 22, 21 and 24% for the top 15% surface residues by using energy score, conservation score and interface propensity, respectively. Compared to the corresponding values in Table 3, the use of patch is useful for a more accurate prediction of binding sites for conservation score and interface propensity but not for the energy score. This further highlights the negative effect of high-energy interface residues surrounded by low-energy surface residues on the ability of the energy score to rank initial patches.

One perhaps notes that some false positives predicted by PINUP are actually the binding site of other molecules instead of the expected partner protein. This is because proteins often have multiple interacting partners. Some bind to ligands and others interact with other proteins or nucleic acids. The predicted protein binding site of aspartate transcarbamoylase (1ekx) is overlapped with the PAL ligand binding site, rather than the observed protein–protein interface. A domain–domain interaction site of Enteropathogenic E.coli intimin C-terminal domain (1f00), is predicted as the protein–protein binding interface. Thus, in future, we plan to develop a significance score for predicted interface regions. Those with a high significance score (rather than only one interface) will be treated as predicted interfaces. This will allow the possibility of predicting multiple interface regions for a given protein.

One interesting observation is that PINUP provides a much more accurate prediction for interfaces of inhibitors. This is true for both training and testing sets. Similar results were observed in the previous study, where interface residues were predicted by merely energy score (38). One possible reason is that inhibitors have a higher expected accuracy. That is, it has a larger interface in a relatively small surface, compared to other proteins.


Funding to pay the Open Access publication charges for this article was provided by National Institutes of Health (R01 GM 066049, 068530), USA and a grant from HHMI to SUNY Buffalo, Y. Z. is also partially supported by a two-base grant (No. 20340420391) from National Science Foundation of China, by the Center for Computational Research and the Keck Center for Computational Biology at SUNY Buffalo.

Conflict of interest statement. None declared.


1. Bork P., Jensen L.J., von Mering C., Ramani A.K., Lee I., Marcotte E.M. Protein interaction networks from yeast to human. Curr. Opin. Struct. Biol. 2004;14:292–299. [PubMed]
2. Aloy P., Russell R.B. Structural systems biology: modelling protein interactions. Nature Rev. Mol. Cell. Biol. 2006;7:188–197. [PubMed]
3. Wodak S.J., Mendez R. Prediction of protein–protein interactions: the CAPRI experiment, its evaluation and implications. Curr. Opin. Struct. Biol. 2004;14:242–249. [PubMed]
4. Mendez R., Leplae R., De Maria L., Wodak S.J. Assessment of blind predictions of protein–protein interactions: current status of docking methods. Proteins. 2003;52:51–67. [PubMed]
5. Liu S., Zhang C., Zhou H., Zhou Y. A physical reference state unifies the structure-derived potential of mean force for protein folding and binding. Proteins. 2004;56:93–101. [PubMed]
6. Aloy P., Querol E., Aviles F.X., Sternberg M.J. Automated structure-based prediction of functional sites in proteins: applications to assessing the validity of inheriting protein function from homology in genome annotation and to protein docking. J. Mol. Biol. 2001;311:395–408. [PubMed]
7. Russell R.B., Alber F., Aloy P., Davis F.P., Korkin D., Pichaud M., Topf M., Sali A. A structural perspective on protein–protein interactions. Curr. Opin. Struct. Biol. 2004;14:313–324. [PubMed]
8. Zhang C., Liu S., Zhou Y. Docking prediction using biological information, ZDOCK sampling technique, and clustering guided by the DFIRE statistical energy function. Proteins. 2005;60:314–318. [PubMed]
9. Jones S., Thornton J.M. Prediction of protein–protein interaction sites using patch analysis. J. Mol. Biol. 1997;272:133–143. [PubMed]
10. Neuvirth H., Raz R., Schreiber G. ProMate: a structure based prediction program to identify the location of protein–protein binding sites. J. Mol. Biol. 2004;338:181–199. [PubMed]
11. Glaser F., Steinberg D.M., Vakser I.A., Ben-Tal N. Residue frequencies and pairing preferences at protein–protein interfaces. Proteins. 2001;43:89–102. [PubMed]
12. Young L., Jernigan R.L., Covell D.G. A role for surface hydrophobicity in protein–protein recognition. Protein Sci. 1994;3:717–729. [PMC free article] [PubMed]
13. Lo Conte L., Chothia C., Janin J. The atomic structure of protein–protein recognition sites. J. Mol. Biol. 1999;285:2177–2198. [PubMed]
14. Jones S., Thornton J.M. Principles of protein–protein interactions. Proc. Natl Acad. Sci. USA. 1996;93:13–20. [PMC free article] [PubMed]
15. Zhou H.X., Shan Y. Prediction of protein interaction sites from sequence profile and residue neighbor list. Proteins. 2001;44:336–343. [PubMed]
16. Fariselli P., Pazos F., Valencia A., Casadio R. Prediction of protein–protein interaction sites in heterocomplexes with neural networks. Eur. J. Biochem. 2002;269:1356–1361. [PubMed]
17. Panchenko A.R., Kondrashov F., Bryant S. Prediction of functional sites by analysis of sequence and structure conservation. Protein Sci. 2004;13:884–892. [PMC free article] [PubMed]
18. Pupko T., Bell R.E., Mayrose I., Glaser F., Ben-Tal N. Rate4Site: an algorithmic tool for the identification of functional regions in proteins by surface mapping of evolutionary determinants within their homologues. Bioinformatics. 2002;18:S71–S77. [PubMed]
19. Chung J.L., Wang W., Bourne P.E. Exploiting sequence and structure homologs to identify protein–protein binding sites. Proteins. 2006;62:630–640. [PubMed]
20. Hu Z., Ma B., Wolfson H., Nussinov R. Conservation of polar residues as hot spots at protein interfaces. Proteins. 2000;39:331–342. [PubMed]
21. Bradford J.R., Westhead D.R. Asymmetric mutation rates at enzyme-inhibitor interfaces: implications for the protein-protein docking problem. Protein Sci. 2003;12:2099–2103. [PMC free article] [PubMed]
22. Caffrey D.R., Somaroo S., Hughes J.D., Mintseris J., Huang E.S. Are protein–protein interfaces more conserved in sequence than the rest of the protein surface? Protein Sci. 2004;13:190–202. [PMC free article] [PubMed]
23. Grishin N.V., Phillips M.A. The subunit interfaces of oligomeric enzymes are conserved to a similar extent to the overall protein sequences. Protein Sci. 1994;3:2455–2458. [PMC free article] [PubMed]
24. Mintseris J., Weng Z. Structure, function, and evolution of transient and obligate protein–protein interactions. Proc. Natl Acad. Sci. USA. 2005;102:10930–10935. [PMC free article] [PubMed]
25. Lichtarge O., Bourne H.R., Cohen F.E. An evolutionary trace method defines binding surfaces common to protein families. J. Mol. Biol. 1996;257:342–358. [PubMed]
26. Madabushi S., Yao H., Marsh M., Kristensen D.M., Philippi A., Sowa M.E., Lichtarge O. Structural clusters of evolutionary trace residues are statistically significant and common in proteins. J. Mol. Biol. 2002;316:139–154. [PubMed]
27. Herzberg O., Moult J. Analysis of the steric strain in the polypeptide backbone of protein molecules. Proteins. 1991;11:223–229. [PubMed]
28. Elcock A.H. Prediction of functionally important residues based solely on the computed energetics of protein structure. J. Mol. Biol. 2001;312:885–896. [PubMed]
29. Ota M., Kinoshita K., Nishikawa K. Prediction of catalytic residues in enzymes based on known tertiary structure, stability profile, and sequence conservation. J. Mol. Biol. 2003;327:1053–1064. [PubMed]
30. Chelliah V., Chen L., Blundell T.L., Lovell S.C. Distinguishing structural and functional restraints in evolution in order to identify interaction sites. J. Mol. Biol. 2004;342:1487–1504. [PubMed]
31. Cheng G., Qian B., Samudrala R., Baker D. Improvement in protein functional site prediction by distinguishing structural and functional constraints on protein family evolution using computational design. Nucleic Acids Res. 2005;33:5861–5867. [PMC free article] [PubMed]
32. Jones S., Thornton J.M. Analysis of protein–protein interaction sites using surface patches. J. Mol. Biol. 1997;272:121–132. [PubMed]
33. Cole C., Warwicker J. Side-chain conformational entropy at protein–protein interfaces. Protein Sci. 2002;11:2860–2870. [PMC free article] [PubMed]
34. Fernandez A., Scheraga H.A. Insufficiently dehydrated hydrogen bonds as determinants of protein interactions. Proc. Natl Acad. Sci. USA. 2003;100:113–118. [PMC free article] [PubMed]
35. Bradford J.R., Westhead D.R. Improved prediction of protein–protein binding sites using a support vector machines approach. Bioinformatics. 2005;21:1487–1494. [PubMed]
36. Bordner A.J., Abagyan R. Statistical analysis and prediction of protein–protein interfaces. Proteins. 2005;60:353–366. [PubMed]
37. Chen H., Zhou H.X. Prediction of interface residues in protein–protein complexes by a consensus neural network method: test against NMR data. Proteins. 2005;61:21–35. [PubMed]
38. Liang S., Zhang J., Zhang S., Guo H. Prediction of the interaction site on the surface of an isolated protein structure by analysis of side chain energy scores. Proteins. 2004;57:548–557. [PubMed]
39. Liang S., Grishin N.V. Effective scoring function for protein sequence design. Proteins. 2004;54:271–281. [PubMed]
40. Dunbrack R.L., Jr, Cohen F.E. Bayesian statistical analysis of protein side-chain rotamer preferences. Protein Sci. 1997;6:1661–1681. [PMC free article] [PubMed]
41. Henikoff S., Henikoff J.G. Amino acid substitution matrices from protein blocks. Proc. Natl Acad. Sci. USA. 1992;89:10915–10919. [PMC free article] [PubMed]
42. Brooks B.R., Bruccoleri R.E., Olafson B.D., States D.J., Swaminathan S., Karplus M. CHARMM: a program for macromolecular energy, minimization and dynamics calculation. J. Comput. Chem. 1983;4:187–217.
43. Berman H.M., Westbrook J., Feng Z., Gilliland G., Bhat T.N., Weissig H., Shindyalov I.N., Bourne P.E. The Protein Data Bank. Nucleic Acids Res. 2000;28:235–242. [PMC free article] [PubMed]
44. Word J.M., Lovell S.C., Richardson J.S., Richardson D.C. Asparagine and glutamine: using hydrogen atom contacts in the choice of side-chain amide orientation. J. Mol. Biol. 1999;285:1735–1747. [PubMed]
45. Chen R., Mintseris J., Janin J., Weng Z. A protein–protein docking benchmark. Proteins. 2003;52:88–91. [PubMed]

Articles from Nucleic Acids Research are provided here courtesy of Oxford University Press
PubReader format: click here to try


Save items

Related citations in PubMed

See reviews...See all...

Cited by other articles in PMC

See all...


Recent Activity

Your browsing activity is empty.

Activity recording is turned off.

Turn recording back on

See more...