• We are sorry, but NCBI web applications do not support your browser and may not function properly. More information
Logo of bmcbioiBioMed Centralsearchsubmit a manuscriptregisterthis articleBMC Bioinformatics
BMC Bioinformatics. 2007; 8: 373.
Published online Oct 5, 2007. doi:  10.1186/1471-2105-8-373
PMCID: PMC2174957

qPIPSA: Relating enzymatic kinetic parameters and interaction fields

Abstract

Background

The simulation of metabolic networks in quantitative systems biology requires the assignment of enzymatic kinetic parameters. Experimentally determined values are often not available and therefore computational methods to estimate these parameters are needed. It is possible to use the three-dimensional structure of an enzyme to perform simulations of a reaction and derive kinetic parameters. However, this is computationally demanding and requires detailed knowledge of the enzyme mechanism. We have therefore sought to develop a general, simple and computationally efficient procedure to relate protein structural information to enzymatic kinetic parameters that allows consistency between the kinetic and structural information to be checked and estimation of kinetic constants for structurally and mechanistically similar enzymes.

Results

We describe qPIPSA: quantitative Protein Interaction Property Similarity Analysis. In this analysis, molecular interaction fields, for example, electrostatic potentials, are computed from the enzyme structures. Differences in molecular interaction fields between enzymes are then related to the ratios of their kinetic parameters. This procedure can be used to estimate unknown kinetic parameters when enzyme structural information is available and kinetic parameters have been measured for related enzymes or were obtained under different conditions. The detailed interaction of the enzyme with substrate or cofactors is not modeled and is assumed to be similar for all the proteins compared. The protein structure modeling protocol employed ensures that differences between models reflect genuine differences between the protein sequences, rather than random fluctuations in protein structure.

Conclusion

Provided that the experimental conditions and the protein structural models refer to the same protein state or conformation, correlations between interaction fields and kinetic parameters can be established for sets of related enzymes. Outliers may arise due to variation in the importance of different contributions to the kinetic parameters, such as protein stability and conformational changes. The qPIPSA approach can assist in the validation as well as estimation of kinetic parameters, and provide insights into enzyme mechanism.

Background

The ability to estimate enzymatic kinetic parameters is very important for metabolic simulations [1-3]. This is because experimental values of the kinetic parameters measured under exactly the conditions of the model for exactly the proteins in the model are usually missing. Often kinetic parameters have been measured only for a related enzyme or for the correct enzyme but under different conditions, e.g. different temperature, pH or ionic strength. Therefore we developed a method that relates variations in kinetic parameters to differences in protein structures. This method can be used to check the consistency of kinetic measurements described in the literature with available protein structural data as well as to make estimates of kinetic parameters based on enzyme structures and kinetic parameters for related enzymes.

Protein structural information provides a basis for predicting and rationalizing protein function. Given a protein structure, molecular simulations can be performed and these can be used to compute kinetic parameters [4]. For example, Brownian dynamics simulations can be used to simulate substrate-enzyme diffusional association and compute bimolecular association rate constants [5,6]. We previously demonstrated how rate constants computed by Brownian dynamics simulation of the diffusional association of superoxide and myeloperoxidase could be used in the mathematical modeling and simulation of the oscillatory behaviour of metabolite levels in activated white blood cells [2]. The type of molecular simulation to use must be chosen according to the mechanism determining the kinetic parameter. Whereas Brownian dynamics is appropriate for diffusional processes, molecular dynamics techniques may be required to simulate conformational changes and quantum mechanics for chemical reaction steps. These simulations can be computationally demanding and the accurate computation of kinetic parameters by simulations is a challenging and on-going research topic [4]. Therefore, a simpler, less computationally demanding and more robust approach to exploit protein structural information is required in the context of biochemical network simulation. qPIPSA is designed to fulfill this requirement.

In qPIPSA, molecular descriptors are related to kinetic parameters. The molecular descriptors are the molecular interaction fields (MIFs) of the proteins. Molecular interaction fields map the interaction energy between a chemical probe and the target protein as the chemical probe is moved over a grid of points [7]. Diverse chemical probes may be used, e.g. a water molecule, carbonyl oxygen or hydroxyl group. When the probe is a point charge, the molecular interaction field corresponds to the molecular electrostatic potential. Intermolecular interactions are fundamental to enzymatic reactions and are dependent on the enzyme MIFs. MIFs are often used in 3D-QSAR studies to derive quantitative structure-activity relationships (QSARs) [7]. This approach may be taken for proteins [8] as well as for small molecules [9,10]. The QSARs are generally derived by partial least squares (PLS) chemometric procedures for a training set with experimentally determined parameters. While such training procedures can be applied to predict enzyme parameters [11], typically, insufficient experimental data on kinetic parameters are available for training by PLS. In qPIPSA, we therefore employ a simpler linear regression procedure for which only two experimental measurements of a kinetic parameter for enzymes and at least one known three-dimensional structure are required. Further experimental measurements can be used and will help to improve the accuracy of predictions and assess the confidence of the parameter estimates. qPIPSA is based on the PIPSA method [12] which has been used to classify the interaction properties of protein families using MIFs [13,14].

In this paper, we will focus on the use of molecular electrostatic potential as the descriptor MIF in qPIPSA. The molecular electrostatic potential is usually the most informative MIF for this purpose. The long-range nature of electrostatic interactions means that similarities or differences in electrostatic potentials are often not detected in a sequence analysis. Even proteins with low sequence similarity can have rather similar electrostatic potentials [12,15]. For estimating enzyme kinetic parameters, the molecular electrostatic potential is appropriate because electrostatics are considered to be the most important contributor to enzymes' catalytic abilities [16], e.g. to stabilization of the transition state. Electrostatic steering has also been shown to enhance the association rates in fast, diffusion-influenced enzymes, such as superoxide dismutase [17].

In qPIPSA, the MIFs of different proteins and/or the same protein under different conditions, e.g. in a different titration state at a different pH, are compared. The enzyme kinetic parameters kcat and Km are associated with specific substrate-enzyme interactions. For the binding process relevant to the kinetic parameter, only one binding partner (here the enzyme) is modeled. The ligand is thus assumed to be the same or similar for all the protein structures compared. The differences in kinetic parameters are assumed to be determined by differences in the protein MIFs, and these differences are calculated in a region around the active site of the enzyme. Thus, qPIPSA requires no prior knowledge of the reaction mechanism. However, information about the location of the active site or substrate or ligand binding residues is used to define the region over which to compare MIFs. qPIPSA can be used to estimate missing kinetic parameters, to check the consistency of different measurements, and to investigate the mechanistic determinants of kinetic parameters. As such, it is a useful tool for biochemistry studies in general and for biochemical network simulation and comparative systems biology in particular.

In this paper, we first outline the qPIPSA approach. Details are given in the Methods section. Results are described for four illustrative and experimentally well-characterized enzymes. The criteria for choosing these enzymes were the availability of enough experimental kinetic and structural data for validation of the methodology, and a reasonable diversity in enzyme type. First, we present an analysis of acetylcholinesterase (AChE) mutants. This shows that not only inhibitor association rates but also substrate Km and kcat/Km parameters correlate remarkably well with the electrostatic potential differences near the active site of the enzyme. We then analyze superoxide dismutase (SOD) and show that the ionic strength and pH dependence of the rate constants for superoxide can be explained by the changes in the electrostatic potential. Next, we examine triose phosphate isomerase (TPI) enzymes from 12 different species. For TPIs, the electrostatic potential differences are found to be good descriptors for the cross-species variation in kinetic parameters kcat/Km and Km. In the fourth case, for 10 class I fructose-1,6-bisphosphate aldolases (FBA) the correlations are less obvious, but detectable. It appears that the conformation of the C-terminal region of FBA is critical for a description of the kinetics of this enzyme. Finally, we examine different protein structural modeling procedures and how to choose the comparison region for interaction fields, showing that this relates to enzyme mechanism.

Theory

We postulate that the ratio of the kinetic parameters ka and kb of a pair of enzymes, a and b, correlates with the average differences in their molecular interaction fields (MIFs), Φa and Φb:

ln(ka/kb)~αR(ΦaΦb)/R1 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacyGGSbaBcqGGUbGBcqGGOaakcqWGRbWAdaWgaaWcbaGaemyyaegabeaakiabc+caViabdUgaRnaaBaaaleaacqWGIbGyaeqaaOGaeiykaKIaeiOFa4hcciGae8xSdeMaeyyXIC9aaabuaeaacqGGOaakcqqHMoGrdaWgaaWcbaGaemyyaegabeaakiabgkHiTiabfA6agnaaBaaaleaacqWGIbGyaeqaaOGaeiykaKcaleaacqWHsbGuaeqaniabggHiLdGccqGGVaWldaaeqbqaaiabigdaXaWcbaGaeCOuaifabeqdcqGHris5aaaa@4E5F@

where R is the region selected for comparison and considered important for the kinetic parameter k. The differences in MIF, Φ, are summed over the grid points in the "skins" (see Methods section for a definition) around the proteins within a distance R from a specified point, defining the region R and divided by the number of points in the overlapping skins to obtain a size-independent measure.

When applied to a set of proteins, the correlation (1) is required for all pairs with known kinetic parameters, and the correlation factor α is derived by minimizing the function

ab{ln(ka/kb)αR(ΦaΦb)/R1}2 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaadaaeqbqaaiabcUha7jGbcYgaSjabc6gaUjabcIcaOiabdUgaRnaaBaaaleaacqWGHbqyaeqaaOGaei4la8Iaem4AaS2aaSbaaSqaaiabdkgaIbqabaGccqGGPaqkcqGHsisliiGacqWFXoqycqGHflY1daaeqbqaaiabcIcaOiabfA6agnaaBaaaleaacqWGHbqyaeqaaOGaeyOeI0IaeuOPdy0aaSbaaSqaaiabdkgaIbqabaGccqGGPaqkaSqaaiabdkfasbqab0GaeyyeIuoakiabc+caVmaaqafabaGaeGymaedaleaacqWGsbGuaeqaniabggHiLdGccqGG9bqFdaahaaWcbeqaaiabikdaYaaaaeaacqWGHbqycqWGIbGyaeqaniabggHiLdaaaa@5699@

The relative percentage correlation error is defined using the RMSD of the left-hand side of (1) from the right-hand side and it is given by formula (3), with α minimizing (2):

100%ab{ln(ka/kb)αR(ΦaΦb)/R1}2/ab(ln(ka/kb))2 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqaIXaqmcqaIWaamcqaIWaamcqGGLaqjcqGHflY1daGcaaqaamaaqafabaGaei4EaSNagiiBaWMaeiOBa4MaeiikaGIaem4AaS2aaSbaaSqaaiabdggaHbqabaGccqGGVaWlcqWGRbWAdaWgaaWcbaGaemOyaigabeaakiabcMcaPiabgkHiTGGaciab=f7aHjabgwSixpaaqafabaGaeiikaGIaeuOPdy0aaSbaaSqaaiabdggaHbqabaGccqGHsislcqqHMoGrdaWgaaWcbaGaemOyaigabeaakiabcMcaPaWcbaGaemOuaifabeqdcqGHris5aOGaei4la8YaaabuaeaacqaIXaqmaSqaaiabdkfasbqab0GaeyyeIuoakiabc2ha9naaCaaaleqabaGaeGOmaidaaOGaei4la8YaaabuaeaacqGGOaakcyGGSbaBcqGGUbGBcqGGOaakcqWGRbWAdaWgaaWcbaGaemyyaegabeaakiabc+caViabdUgaRnaaBaaaleaacqWGIbGyaeqaaOGaeiykaKIaeiykaKYaaWbaaSqabeaacqaIYaGmaaaabaGaemyyaeMaemOyaigabeqdcqGHris5aaWcbaGaemyyaeMaemOyaigabeqdcqGHris5aaWcbeaaaaa@7038@

The absolute error may be defined as:

ab{ln(ka/kb)αR(ΦaΦb)/R1}2/ab1 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaadaGcaaqaamaaqafabaGaei4EaSNagiiBaWMaeiOBa4MaeiikaGIaem4AaS2aaSbaaSqaaiabdggaHbqabaGccqGGVaWlcqWGRbWAdaWgaaWcbaGaemOyaigabeaakiabcMcaPiabgkHiTGGaciab=f7aHjabgwSixpaaqafabaGaeiikaGIaeuOPdy0aaSbaaSqaaiabdggaHbqabaGccqGHsislcqqHMoGrdaWgaaWcbaGaemOyaigabeaakiabcMcaPaWcbaGaemOuaifabeqdcqGHris5aOGaei4la8YaaabuaeaacqaIXaqmaSqaaiabdkfasbqab0GaeyyeIuoakiabc2ha9naaCaaaleqabaGaeGOmaidaaOGaei4la8YaaabuaeaacqaIXaqmaSqaaiabdggaHjabdkgaIbqab0GaeyyeIuoaaSqaaiabdggaHjabdkgaIbqab0GaeyyeIuoaaSqabaaaaa@5D5A@

It gives an average deviation of the fit in ln units, but is less useful than expression (3) when comparing different cases with different degrees of kinetic parameter deviation. It is, for example, small in the absence of correlation when kinetic constants differ insignificantly in a set of proteins compared. We also used the Pearson correlation coefficient (R-coefficient) to estimate the degree of overall correlation between two sets of parameters x and y:

r=(xx)(yy)/(xx)2(yy)2 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGYbGCcqGH9aqpdaaadaqaaiabcIcaOiabdIha4jabgkHiTmaaamaabaGaemiEaGhacaGLPmIaayPkJaGaeiykaKIaeiikaGIaemyEaKNaeyOeI0YaaaWaaeaacqWG5bqEaiaawMYicaGLQmcacqGGPaqkaiaawMYicaGLQmcacqGGVaWldaGcaaqaamaaamaabaGaeiikaGIaemiEaGNaeyOeI0YaaaWaaeaacqWG4baEaiaawMYicaGLQmcacqGGPaqkdaahaaWcbeqaaiabikdaYaaaaOGaayzkJiaawQYiamaaamaabaGaeiikaGIaemyEaKNaeyOeI0YaaaWaaeaacqWG5bqEaiaawMYicaGLQmcacqGGPaqkdaahaaWcbeqaaiabikdaYaaaaOGaayzkJiaawQYiaaWcbeaaaaa@556E@

with <> being an average over all possible values of x or y.

Predictions of kinetic parameters are carried out by first deriving the parameter α for all pairs with known kinetic parameters and then averaging predictions for any unknown case from pairwise comparisons to all proteins with known kinetic parameters. The minimum required number of known parameters equals two in this approach.

When Φ is the molecular electrostatic potential, a physically meaningful estimate for the parameter α is Φ is -q/kBT, where q is the net charge of the substrate, kB is the Boltzmann constant and T is the temperature. This corresponds to the kinetic parameter being determined by the interaction energy of the substrate charge q with the electrostatic potential Φ of the enzyme: k ~ exp(-q·Φ/kBT).

We also tested other possibilities for defining correlations, including

ln(ka/kb)~αln(Re±Φa/Re±Φb) MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacyGGSbaBcqGGUbGBcqGGOaakcqWGRbWAdaWgaaWcbaGaemyyaegabeaakiabc+caViabdUgaRnaaBaaaleaacqWGIbGyaeqaaOGaeiykaKIaeiOFa4hcciGae8xSdeMaeyyXIC9aaSGbaeaacyGGSbaBcqGGUbGBcqGGOaakdaaeqbqaaiabdwgaLnaaCaaaleqabaGaeyySaeRaeuOPdy0aaSbaaWqaaiabdggaHbqabaaaaaWcbaGaeCOuaifabeqdcqGHris5aaGcbaWaaabuaeaacqWGLbqzdaahaaWcbeqaaiabgglaXkabfA6agnaaBaaameaacqWGIbGyaeqaaaaaaSqaaiabhkfasbqab0GaeyyeIuoaaaGccqGGPaqkaaa@554C@

For a single point in the region R, formulas (1) and (6) give the same results. When there are many points in the comparison region R, formula (6) will enhance the contributions from large and positive (+ sign) or large and negative (- sign) potentials. In some cases, formula (6) gives a better description of interaction field – kinetic parameter correlations.

Similarity indices, e.g. the Hodgkin index [18]

SIab=2RΦaΦb/(RΦa2+RΦb2) MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGtbWucqWGjbqsdaWgaaWcbaGaemyyaeMaemOyaigabeaakiabg2da9iabikdaYiabgwSixpaaqafabaGaeuOPdy0aaSbaaSqaaiabdggaHbqabaaabaGaeCOuaifabeqdcqGHris5aOGaeuOPdy0aaSbaaSqaaiabdkgaIbqabaGccqGGVaWlcqGGOaakdaaeqbqaaiabfA6agnaaDaaaleaacqWGHbqyaeaacqaIYaGmaaGccqGHRaWkaSqaaiabhkfasbqab0GaeyyeIuoakmaaqafabaGaeuOPdy0aa0baaSqaaiabdkgaIbqaaiabikdaYaaaaeaacqWHsbGuaeqaniabggHiLdGccqGGPaqkaaa@5146@, may also be used to compare electrostatic potentials [13]. Similarity indices can assist in assigning a kinetic parameter to a protein when experimental values are available for several related proteins by enabling the most similar protein to be found and hence the kinetic parameter to be estimated [19]. For a more quantitative analysis, a pairwise distance matrix with 1SIab MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaadaGcaaqaaiabdgdaXiabgkHiTiabdofatjabdMeajnaaBaaaleaacqWGHbqycqWGIbGyaeqaaaqabaaaaa@33A2@ as a distance measure can be constructed. We have previously shown how such a distance matrix can be used to compute relative electron transfer rates between plastocyanin and cytochrome f for a set of plastocyanin mutants based on the known electron transfer rate for wild-type plastocyanin [20]. The formulations based on similarity indices however suffer from loss of the sign information present in equation (1) and therefore do not allow a correlation to be made in a fully automated fashion. Equation (1) provides the most direct model in terms of the physical determinants of the quantities computed and has therefore been used in this manuscript. The correlations obtained with equation (1) are mostly of similar or better quality than those using similarity indices or distance matrices based on similarity indices.

The average Boltzmann factor of the ligand-protein interaction energy when the ligand is near or in the active site [21] is another possible descriptor of kinetic parameters. However, as the average Boltzmann factor is calculated from the interaction energy rather than the interaction field, it requires knowledge of the position and charge distribution of the ligand interacting with the enzyme. The use of MIFs alone in qPIPSA is less demanding in terms of modeling but can nevertheless be applied to comparing different enzymes.

Results and discussions

Molecular electrostatic potentials correlate with inhibitor association rate constants and substrate Km and kcat/Km rate constants for a set of acetylcholinesterase (AChE) mutants

We considered the wild-type mouse AChE and 11 mutants with large changes in kinetic parameters, see Table Table1.1. The experimental kinetic data were taken from reference [22]. The electrostatic potential was computed for the 12 proteins and the average difference in electrostatic potential in the AChE active site gorge was then computed for all pairs of proteins. (See Methods section for a description of the exact region for MIF comparison). A remarkable correlation holds between the difference in ln(kon) (logarithm of association rate constant) of the inhibitor, m-trimethyl-ammonio-trifluoro-aceto-phenone (TFK+) with AChE, and the difference in the electrostatic potentials in the AChE active site gorge, see Fig. Fig.1a.1a. This correlation is consistent with the observations that on-rate constants correlate with the diffusional association rates calculated by Brownian dynamics simulations with electrostatic forces [23], and that diffusional association rates correlate with the average Boltzmann factor of the inhibitor-protein electrostatic interaction energy for positions of the inhibitor around the AChE active site [24]. Previously it was also shown that electrostatic potential values in different parts of the active site correlate with inhibitor kon [25].

Table 1
Kinetic data for mouse acetylcholinesterase (AChE) [22]
Figure 1
(a) Correlation between the differences in experimental inhibitor ln(kon) and differences in electrostatic potentials of AchE and (b). leave-two-out cross-validation for prediction of kon values for the inhibitor TFK+ and AChE. Each point in part (a) ...

As seen in Fig. Fig.1a,1a, a decrease in the average electrostatic potential by 1.39 kcal/mol/e results in an increase of kon for TFK+ by 1 natural log (ln) unit (equivalent to a factor of 2.72). This relation is in agreement with the expectation that ln(kon) is is a linear function of -Φ/kBT, where Φ is an average electrostatic potential in the comparison region. Standard LTO (leave-two-out) cross-validation to assess the predictive ability shows that accurate predictions of kon values can be obtained on the basis of this correlation, see Fig. Fig.1b.1b. Thus an excellent linear correlation between calculated differences in electrostatic potentials and measured ln(kon) values for TFK+ could be achieved and this can be used to predict ln(kon) values for AChE mutants.

A similar correlation was obtained for the difference in electrostatic potentials and the kcat/Km values of acetylthiocholine (ATCh), a substrate of AChE [22] (see Fig. Fig.2a).2a). Here, a decrease of the average electrostatic potential by 1.06 kcal/mol/e results in an increase of kcat/Km for acetylthiocholine by 1 ln unit. The overall correlation seems to be composed of different correlations. For surface residues (filled circles), changes in potential in the region of comparison result in larger changes in kcat/Km than for mutations of active site residues (open circles). This can be expected, because the surface residues influence kcat/Km not only via the potential in the active site comparison region but also because of the influence of the electrostatic potential of surface residues on the substrate as it approaches the active site.

Figure 2
Correlation between (a) experimental ln(kcat /Km and (b) experimental ln(Km for the substrate ATCh and electrostatic potential differences for different AChE mutants. Each point corresponds to the differences for one protein variant pair. The straight ...

The Km values for ATCh are also correlated with the AChE electrostatic potentials as shown in Fig. Fig.2b.2b. A decrease of the electrostatic potential by 1.68 kcal/mol/e results in a decrease of Km for ATCh by 1 ln unit.

Correlations for koff for TFK+ and kcat for ATCh (not shown) are significantly weaker. ln(kcat) is the sum of ln(kcat/Km) and ln(Km). Both of these quantities correlate with the electrostatic potential differences, but with opposite signs. Therefore, their sum appears to have a weak correlation with electrostatic potential differences.

An important finding here is that not only the inhibitor association rate constant but also the intrinsic enzyme kinetic parameters for a substrate can be correlated with the electrostatic potentials near the active site. The results are, however, sensitive to the modeling accuracy. We treated E202 and H447 as singly and doubly protonated, respectively (see Methods section). The importance of correct charge assignments for these residues was discussed previously [23] when performing Brownian dynamics simulations for this system. Assigning standard protonation states at pH 7 to these residues would weaken the correlation between electrostatic potential differences and inhibitor association rate constants and thus the predictive ability. For example, predictions made for 9 AChE mutants using known kon values for the wild-type and the D74N/D280V/D283N mutant of AChE and standard protonation states for E202 and H447 would result in a factor of 20 under-prediction of the inhibitor association rates for the mutants involving E202 (see Fig. Fig.3b).3b). The importance of charge assignments for these residues was discussed previously for Brownian dynamics simulations of this system. Only when the non-standard protonation states of these residues was assigned, was a good correlation between measured and calculated kon obtained (see Fig. Fig.3a3a).

Figure 3
Importance of correct modeling of the protonation states of titratable residues for predicting of kon values for TFK+ and AChE . Predictions were made for 9 AChE mutants using known kon values for the wild-type and D74N/D280V/D283N mutant of AChE at pH ...

The ionic strength and pH dependence of kinetic constants of Cu, Zn- superoxide dismutase (SOD) are reflected in molecular electrostatic potential

Kinetic constants measured under different environmental conditions can be correlated with MIFs computed for the respective conditions. Here, we demonstrate this for the rate constant for the first step of the SOD reaction which is limited by superoxide association to SOD. Bovine SOD was chosen because its reaction mechanism has been thoroughly studied [26] and the consistency of kinetic constants measured under different conditions has been analyzed [27]. The association rate of superoxide to SOD shows a pronounced pH- and ionic strength dependence. The pH was varied between 7.7 and 11.0 (at a constant ionic strength of 20 mM) and the ionic strength was varied between 20 and 250 mM at a constant pH of 7.7.

For both types of variation of environmental conditions, bovine SOD exhibits a distinct correlation between electrostatic potential differences and differences in the experimental kinetic constant. A 1 ln unit change in kinetic constant corresponds to 0.45 kcal/mol/e change of average electrostatic potential in the region within 10 Å of the catalytic copper ion, see Fig. Fig.4a.4a. This ratio also applies when comparing bovine and human SOD at pH 7 and an ionic strength of 20 mM.

Figure 4
(a) Correlation between experimental ln(kcat /Km) for superoxide and electrostatic potential differences for SOD from different organisms or under different conditions, and (b) experimental and calculated pH dependence of the rate constant for association ...

For the bovine SOD above pH 11 and for the Photobacterium leiognathi SOD, larger electrostatic potential differences correspond to a 1 ln unit change in kinetic constant. This deviation in pH dependence can be attributed to the difficulty in calculating the pKas of amino acid residues of SOD. The pH-dependence of the rate constants of bovine SOD is shown in Fig. Fig.4b.4b. For example, assigning a low dielectric constant (ε = 4) to the protein interior resulted in lower pKa values of Lys and Arg residues and a too steep drop in the rate constant around pH 9, see Fig. Fig.4b.4b. Significant pH-dependent changes in kcat/Km are observed experimentally to occur at pH 11. This behavior can be reproduced when electrostatic potentials are calculated using the charges, assigned from pKa calculations with a protein dielectric constant of 78. Use of this high value of the protein dielectric in computations of the pKa values of these residues is expected to give the most accurate pKa values according to the methodology applied [28].

Molecular electrostatic potential differences correlate with substrate Km and kcat/Km rate constants for a set of triose phosphate isomerases (TPI) from different organisms

The reaction mechanism of TPI has been studied extensively [29] and the consistency of measured kinetic constants has been analyzed [30]. Here, we selected TPIs from 12 different organisms that have Km and kcat values in the BRENDA database [31]. In the database, some of the kinetic parameters are duplicated and some are inconsistent with each other. Therefore, before applying qPIPSA, the kinetic parameters were analyzed by referring to the original papers, rejecting values measured under very different conditions and favoring cases in which both kcat and Km originated from the same authors (see Table Table2).2). The results given here are for TPI protein structure models built using SwissModel and Modeller [32] with the "Turbo" modeling protocol (see Methods section). For the kcat/Km values of TPIs, the comparison region is centered on the oxygen atom of residue L230 in the active site of TPI. A systematic scanning of different regions (considering each accessible atom as a comparison region center) showed that the differences in electrostatic potential in this region most accurately described changes in the kinetic parameter kcat/Km (see below).

Table 2
Kinetic constants for triose phosphate isomerases (TPI) from 12 different organisms with glyceraldehyde-3-phosphate as substrate

For the kcat/Km parameter, we find that an increase of kcat/Km of 1 ln unit is related to a ca. 1.59 kcal/mol/e increase of average electrostatic potential, see Fig. 5a,b. The opposite dependence compared to AChE reflects the fact that for TPI the substrate has a negative -2e charge, whereas for AChE, TFK+ and ATCh have a +1e charge.

Figure 5
(a and b) Correlation between experimental ln(kcat/Km) and electrostatic potential differences for TPI and (c and d) prediction of kcat/Km for TPI for the substrate glyceraldehyde-3-phosphate. Correlations (a and b) are for differences in ln(kcat/Km) ...

This correlation can be used to predict enzyme kinetic parameters of triose phosphate isomerases. The kinetic parameters from species with the highest and lowest kcat/Km values were used to calculate and predict parameters for the remaining 10 species, see Fig. Fig.5c5c and and5d.5d. In general, a satisfying agreement between calculated and experimental values was obtained. The relative ordering of all species could be reproduced. Computation of the kcat/Km parameters for the remaining 10 TPIs shows that the TPIs (from rabbit and Giardia lamblia) appear to be outliers, see Fig. Fig.5c5c and and5d.5d. The predicted kcat/Km values for enzymes from these organisms were significantly larger than those measured experimentally. The correlations were better for the SwissModel models (Fig. (Fig.5b5b and and5d)5d) than for the Modeller "Turbo" models (Fig. (Fig.5a5a and and5c5c).

In a similar manner, enzymatic Km values can be predicted from a correlation between electrostatic potential differences and experimental Km values. In this case, the region around residue W168 was chosen to calculate potential differences as changes in the potential were found to describe changes in the Km value most accurately in this region (see below). A 0.85 kcal/mol/e increase in electrostatic potential results in a 1 ln unit decrease in Km value. (In the region around L230O, a 0.94 kcal/mol/e increase in potential results in a 1 ln unit decrease in Km value). Based on the two extrema, the Km values of the remaining species can be computed (Fig. (Fig.6).6). The relative ordering of Km values of all species could be reproduced.

Figure 6
Prediction of Km values for TPI for the substrate glyceraldehyde-3-phosphate. Predictions were made for 10 TPIs based on experimental values for 2 TPIs (from V. marinus (TPIS_VIBMA) and P. falciparum (TPIS_PLAFA). Two outliers (with errors greater than ...

In TPIs, the substrate has a net charge of opposite sign and twice the magnitude of the substrate of AchE. Therefore the correlation coefficient should be expected to have approximately half the magnitude and an inverted sign compared to the AchE case, in accord with its physical meaning, described in the Theory section. This is indeed the case for the correlation between the electrostatic potential differences and Km values.

Molecular electrostatic potential correlates with substrate Km and kcat/Km values for Class I Fructose 1,6-bisphosphate aldolase (FBA) when an appropriate conformation of the C-terminal part of the protein is used

Fructose-1,6-(bis)phosphate aldolase is a ubiquitous enzyme that catalyzes the reversible aldol cleavage of fructose-1,6-(bis)phosphate to glyceraldehyde-3-phosphate and glycerone phosphate. Class I aldolases of animals and higher plants use covalent catalysis through a Schiff-base intermediate. Class II aldolases of most bacteria and fungi require a divalent metal cation as a cofactor [33].

Kinetic data for 10 different class I fructose 1,6-bisphosphate aldolase enzymes and isozymes from 6 different species are available from the BRENDA database [31], see Table Table3.3. FBA was selected for study here as it is rather well characterized experimentally, with both kcat and Km data available for the 10 different aldolases. The comparison region was chosen to be a collection of points on the skin of the protein (see Methods section) within 10 Å of the active site of human aldolase A. In order to achieve an unbiased assignment of the active site, the center of this region was determined to be the most probable location of the active site pocket using the Putative Active Sites with Spheres (PASS) tool [34]. Other choices, such as the region around K146NZ or K229NZ, atoms of residues known to be involved in substrate binding and reaction, give similar results. The values of kcat/Km are highly correlated with electrostatic potentials in the active site. The correlation corresponds to a change of kcat/Km by 1 ln unit when the electrostatic potential changes by 0.21 kcal/mol/e, see Fig. Fig.7a.7a. This value is smaller than the 1.59 kcal/mol/e for TPI. This can be explained by considering that the substrate in this case is bisphosphate with 2 phosphate groups and a -4e charge at pH 7.

Table 3
Kinetic data for 10 different Class I Fructose-1,6-bisphosphate aldolases (FBA) with fructose-1,6-bisphosphate as substrate
Figure 7
Correlation between experimental ln(kcat/Km) for bisphosphate and electrostatic potential differences for different aldolases (a) and prediction of kcat/Km for bisphosphate and class I aldolase (b). Correlations (a) are for differences in ln(kcat/Km) ...

When making predictions for 8 of the aldolases based on known kcat/Km values for two aldolases, it is found that two proteins, ALF2_LAMJA and ALDOC_RAT, appear as outliers, see Fig. Fig.7b.7b. In both cases, the kcat/Km values are predicted to be more than 4 times lower than the experimental values. This deviation is outside the uncertainty of the qPIPSA method and points to either the need for additional experimental assays or to additional contributions to the catalysis of these two enzymes. Removing the outliers changes the R-coefficient of correlations between experimental and predicted ln(kcat/Km) from 0.63 to 0.87. The Km values for aldolases also show a detectable correlation with electrostatic potentials – 60% error defined by formula (3) and R-coefficient -0.59, see Fig. Fig.88.

Figure 8
Correlation between experimental ln(Km) for bisphosphate and class I aldolase electrostatic potential differences. Correlations are for differences in ln(Km) and electrostatic potential differences for each aldolase protein pair. All possible pairs of ...

Aldolases present a rather complicated case for the analysis based on the electrostatic potential in the active site region because it is known that the kinetic parameters of aldolases are modulated by a flexible C-terminal part as well as isozyme specific residues that are not located in the active site of the enzyme [35,36]. Nevertheless, kinetic parameters correlate with the interaction properties near the active site of the enzyme. The electrostatic potential changes here have a much larger impact on Km than in the other cases investigated in this manuscript: changes of electrostatic potential by 0.3 kcal/mol/e change Km by 1 ln unit. As was already mentioned, this can be attributed to a larger charge of the substrate in this case. At the same time, there is a possibility that putative conformational rearrangements, that are not included in our modeling, are accounted for by electrostatic potential changes implicitly – conformational changes could cause further changes in electrostatic potential and these changes could account for changes in Km directly by contributing to the interaction with the substrate. We also find that, in this case, the selection of a reference structure with an appropriate conformation is very important. The correlation was lost when we used the structure of the human aldolase A (muscle-type) with PDB code 1ALD instead of the structure of the rabbit muscle aldolase A (PDB code 1ZAI) as a template, even though these proteins have 98% sequence identity. The main difference between the two crystal structures is in the conformation of C-terminal residues: in 1ALD, the C-terminal forms part of the active site of the enzyme [37] whereas in 1ZAI, it does not. The critical involvement of the C-terminal loop in catalysis has been noted before. C-terminal truncated aldolases are practically catalytically inactive. See also reference [38].

A conservative comparative modeling protocol is required for correlating MIFs and kinetic parameters

The results described above for TPI are for protein structure models built using Modeller [32] with the "Turbo" modeling protocol (see Methods section). Other modeling protocols were tried and it was found that the modeling accuracy is important for establishing good correlations. We compared two protocols: the "Automodel" mode of Modeller with extensive optimization of the structures and the "Turbo" mode, in which only molecular mechanics minimization of the structures after building the initial homology model (based on a single template) is performed. There is a ca. 1 Å RMS deviation of Cα atoms between these models, but the side chain orientations are more variable in the "Automodel" models. The models showed different degrees of correlation of the electrostatic potential difference and ln(kcat/Km) values, when the region radius, R, was varied from 7.5 to 30 Å. The region itself is a collection of points on a "skin" of 3 Å thickness and accessible to a probe of 1, 2 or 3 Å radius within a distance R from the center, here atom L230O, see Fig. Fig.9a9a.

Figure 9
Correlation errors for ln(kcat/Km) for TPI for (a) different protein structural models and (b) different specifications of the MIF comparison region. (a) The relative error of predicted ln(kcat/Km) values of protein structural models generated by 3 different ...

When a probe radius, δ, of 1 Å was used to define the skin, the correlation error was roughly constant up to R = 18 Å and then abruptly increased for larger values of R. The use of a probe radius of 3 Å gave errors systematically increasing with R for "Turbo" models, and decreasing and reaching a minimum for "Automodel" models. The smallest correlation error, 17%, was obtained at a probe radius of 2 Å and "skin" thickness of 3 Å with "Turbo" models at R = 10 Å. "Automodel" models only gave a correlation error minimum of ca. 50% and this was obtained at R = 20 Å. This shows that the orientation of side chains is important for obtaining correlations. Inconsistent side-chain positioning, as in the "Automodel" case, reduces the possibilities for deriving a correlation between the structure and the kinetic parameters. Nevertheless, a weak correlation can still be identified by using a larger radius R for the comparison region so that errors due to side chain conformational differences are partly cancelled out.

We also tested the models obtained by modeling with SwissModel [39]. We were able to model all 12 TPIs using the the SwissModel web-server, although the web-server does not produce any model when the sequence identity is too low or modeling problems are encountered. The SwissModel models differ from the Modeller "Turbo" models by the same amount (in terms of RMSD of Cα atoms) as the Modeller "Automodel" models, but the SwissModel models have more ordered side chain conformations than the Modeller "Turbo" models, see Fig. Fig.10.10. The correlations derived from the SwissModel models were better than the correlations found for the Modeller "Turbo" models, although differences in RMSDs were very small (Fig. (Fig.1010).

Figure 10
Comparison of three different protein structure modeling protocols for modeling TPIs. The RMSD of atoms of charged side chains is plotted against the RMSD of Cα atoms of equivalent residues in all pairs of TPI models. The data for three different ...

Correlation of kinetic parameters depends on the comparison region and this dependence may give insights into determinants of kinetic parameters

In comparing MIFs and correlating their differences with kinetic parameter differences, a comparison region must be chosen. An appropriate size, shape and location are not obvious, although one can expect that the comparison region should be near the active site of the enzyme. We already noticed that the comparison region should neither be too small nor too large (see Materials and Methods section for a detailed description), a region with a radius of approximately 10 Å is optimal (see Fig. Fig.99).

In the case of AChE, comparisons over a single region in the gorge near the buried active site result in correlations for the inhibitor association rate constant as well as the substrate kcat/Km and Km parameters. Moving the comparison region to the active site results in slightly poorer correlations. Moving the comparison region to the entrance of the gorge results in significantly poorer correlations for the majority of cases. This means that the electrostatic potential closer to the active site is of more importance for the kinetic parameter values, i.e. rate-determining. At the same time, the contribution from the surface residues to the potential in this region does not fully account for the contribution of these residues to the inhibitor association rate constant (Fig. (Fig.2a).2a). This means that the interaction field in this region does not fully describe the kinetics. However, in this case a single region still describes kinetic parameters quite accurately.

In the case of TPI, the best correlations are obtained for different comparison regions for the two different kinetic parameters, Km and kcat/Km. The region best describing kcat/Km is located near the active site (Fig. (Fig.11),11), while variations in Km value are better described by the region around the flexible loop of TPI that can close over the active site for the reaction. The hinge residues have been shown by mutagenesis to be important for the reaction catalyzed by TPI [40,41].

Figure 11
Identification of regions on the surface of TPI relevant for kcat/Km and Km kinetic parameters. By scanning patches and residues on the protein surface, calculating the correlation between electrostatic potential differences and the respective kinetic ...

It can be expected that the same region is responsible for parameters correlated with each other. The appearance of two different regions responsible for 2 different kinetic parameters is expected when the 2 different kinetic parameters are not correlated with each other, as in case of TPI, because the interaction field changes in a single region can describe only one set of parameters. The region responsible for the TPI parameter kcat/Km can be interpreted as contributing to the interaction between the substrate and the enzyme, because it is located near the substrate binding position. The region responsible for Km is close to the active site, but one cannot expect that the substrate interacts with the enzyme there, because the region is on the other side of the flexible loop from the active site. One can assume that the interaction field variations define Km variations indirectly. It is the variation of physico-chemical properties of the residues at the beginning of the flexible loop that changes Km values, but we see these variations as variations of the electrostatic potential around W168.

Conclusion

We have shown that protein interaction field differences can correlate with changes in the kinetic parameters of enzymes. The application possibilities of this observation are two-fold. First, the correlation method can be used to check the consistency of kinetic measurements with available structural data. In the case that the predictions are significantly different from measured parameters, significant structural changes can be expected or the measurement conditions are not consistent with the structure of the enzyme. Second, prediction of kinetic parameters is possible when data consistency is present – given two enzymes with known kinetic parameters and structure, one can derive a correlation and make predictions for the rate constants of a similar enzyme with known structure. Two is the minimum number of known parameters required for application of the method. For error estimates, more known kinetic parameters are needed and the more measured kinetic parameters are used, the more reliable the predictions will be.

An important area of application is the use of protein three-dimensional models to bring experimental measurements into consistency with each other. Measurements of kinetic constants are frequently done under different conditions and often result in different values for the same parameter. Here, the structural properties can be used as a reference point for rationalising two different measurements.

The generality of the qPIPSA method is that it uses a concept of similarity that does not require a priori knowledge of the specific mechanism of the enzymatic reaction. Although only consideration of the detailed mechanism can permit quantification of all the contributions, general structural differences can explain the differences between the enzymes resulting in their different kinetic properties. This is analogous to the case of protein association, in which relative association rates of protein mutants are described well by differences in interaction energies calculated in a simple way [42], although the association process itself may be changed significantly by mutations [43].

The consistency of kinetic measurement conditions and structural data is dependent on the accuracy of structural modeling. The requirement for protein structural model accuracy is in general high. It varies according to the type of parameter. kon or kcat/Km values, for which long-range interactions are important may be less sensitive to structural details than kcat which is defined by short-range interactions. In our study, we avoided the problem of rotating side-chains by forcing equivalent side-chains to have the same conformation. This approach may overlook parameter changes due to side-chain conformational changes, but as shown here, in many cases it allows significant correlations between structural and kinetic parameters to be detected. Improvements in homology modeling methods [44] may result in improvements in this type of prediction.

The minimum requirements for performing qPIPSA are (i) that a structural model of the enzyme of adequate quality for computation of the electrostatic potential in the region of the active site must be available or modellable, and (ii) that enzymatic kinetic parameters for related enzymes having the same reaction mechanism must be available. According to our estimates, a significant fraction of characterized enzymes satisfy these requirements and as such can be investigated by qPIPSA. Factors influencing the validity of qPIPSA include the following. (i) The relevance of the protein structures analysed for the rate-determining catalytic step. Proteins may adopt multiple conformations. Sometimes it may be necessary to do calculations with more than one conformation, e.g. with open and closed forms of an enzyme active site. If one of these gives MIFs that correlate better with known kinetic parameters, this provides some mechanistic information about the determinants of the kinetic parameter. (ii) Similarly, the region over which to compare the MIFs should be chosen. Calculations can be done for a number of regions. For TPIs, the best correlation with kinetic parameters was obtained for different regions for kcat/Km values and Km values. This again provides some mechanistic information on the parameter determinants. (iii) The method is most suitable when the rate-determining step is mechanistically the same across the set of protein structures compared. An outlier might be mechanistically different, but if there is wide mechanistic variation in the dataset, the comparative approach cannot be expected to work. (iv) Molecular dynamics are not currently considered in the approach and if they alter the protein structures in different ways across the dataset they will adversely affect the value of the MIF comparison.

In summary, we have described qPIPSA and demonstrated its validity for establishing correlations between kinetic parameters and MIFs, as well as estimating kinetic parameters using MIFs computed from enzyme structures. qPIPSA can be a useful tool for parameter estimation in biochemical network modelling in systems biology applications as well as in investigating enzymatic structure-function relationships and enzyme mechanisms.

Methods

Modeling of protein structures

All protein structures were modeled by comparative modelling techniques with SwissModel [39]. In addition, Modeller [32] was used to model TPI structures. Comparative modeling techniques are suitable as the level of sequence identity between the template protein with experimentally determined structure and the modeled protein was rather high, being greater than 40% for all cases studied.

One X-ray crystal structure was chosen as the template for each enzyme family (with PDB and subunit identifiers as follows: 1mahA for AChE, 1cbjA for SOD, 1r2rA for TPI, 1zaiA for FBA). TPI and SOD are homodimers, but monomeric models were used here, as there is little interweaving of the monomer structures. Only one monomer of FBA was used for the same reason, although the majority of aldolases are homotetramers.

For a given enzyme family, all comparative models were built from one single template crystal structure even when other crystal structures were available with higher sequence similarity to the modeled protein. This was done to ensure that the differences between the models of different proteins of one enzyme family were due to differences in sequence only and not influenced by differences in crystallization conditions of different template crystal structures. Such differences can, for example, result in different rotamers of flexible side chains being seen in different structures corresponding to the same protein sequence.

For Modeller, as well as using the default mode "Automodel", we used a modeling procedure called "Turbo", to avoid excessive optimization of models. Excessive optimization and MD refinement can result in differences between models of different proteins in one enzyme family that do not reflect the sequence differences.

The "Automodel" class in Modeller8v.1 [32] provides a convenient set up of default parameters for the generation of homology models. By default, protein structural models are energy minimized and then subject to a molecular dynamics and simulated annealing procedure to overcome bad contacts or local minima. A detailed description of the procedure and settings can be found in the Modeller documentation http://salilab.org/modeller/manual/. The CHARMM22 [45] force field for proteins is used with modified dihedral parameters. Energy minimization is done using a conjugate gradients optimisation method for a given maximum number of iterations. The minimal atomic shift for the optimisation convergence test is 0.010 Å. The molecular dynamics integrator uses the Verlet algorithm; the time step in MD simulations is set to 4 fs.

The default (referred to here as "Automodel" procedure) settings correspond to a minimization for a maximum of 200 steps, followed by a "very_fast" MD and simulated annealing simulation. Initial models are randomised in their xyz coordinates by adding a random number between -4.0 and +4.0 Å to the initial coordinates. MD equilibration is performed at 150, 400 and 1000 K for 50 iterations each. Simulated annealing is done at 1000, 800, 500 and 300 K for 300 iterations at each temperature.

The simplified protocol (referred to here as the "Turbo" procedure) uses the "automodel.very_fast" pre-defined settings. They differ from the default in that initial coordinates are not randomised; energy minimization is done for a maximum of 50 steps with no subsequent MD and simulated annealing runs. In both cases, the quality of the generated models was checked with WHATCHECK [46] to remove bad models.

We applied different additional protein structure quality verification tools (PROSA [47], Verify3D [48] and ERRAT [49]). These gave little preference to any one of the three modelling procedures.

Analysis of molecular interaction fields

The PIPSA (Protein Interaction Property Similarity Analysis) [13] software (version 2 http://projects.villa-bosch.de/mcm/software/pipsa with minor modifications) was used to quantify MIF differences. The MIF used was the molecular electrostatic potential. Other MIFs for the interaction of different chemical fragments or functional groups with macromolecules (these can, for example, be computed with the GRID program [50] for a range of probes such as carboxylate, amino and hydrophobic probes), were not used in this work.

PIPSA permits the comparison of the interaction fields for a superimposed protein pair in a region defined by the intersection of the "skins" of these two proteins. The "skin" is defined as a region outside the protein surface, accessible to a probe of radius δ, and inside the surface accessible to a probe of radius δ + σ; the "skin" thus has a thickness σ. The comparison region can be restricted to be within a specified distance from a given point, for example, a chosen atomic center. Using PIPSA, we have previously carried out large scale classification of Plekstrin Homology (PH) domains [12], WW domains [8] and E2 ubiquitin conjugating domains [14] by electrostatic similarity to infer binding characteristics. We have also correlated electron transfer rates with the similarity of electrostatic potentials for a set of plastocyanin mutants [20]. In these studies, a probe of radius ψ = 3 Å and a "skin" of thickness σ = 4 Å were used. In the present study, a probe of radius δ = 2 Å and a "skin" of thickness σ = 3 Å were used. These values are smaller than those used previously because the kinetic parameters of enzymes can be influenced by the interaction fields in the active site close to the enzyme surface. The use of a skin closer to the surface meant that it was important to derive models with consistent side chain orientations (see above).

For each protein structure, the molecular electrostatic potential was computed as follows. Polar hydrogen atoms were added using Whatif [46] and electrostatic potentials were calculated by numerically solving the linearized Poisson-Boltzmann equation using the UHBD program [51] with atomic radii and partial atomic charges assigned from the OPLS parameter set [52]. The grid spacing was set to 1 Å and grid was dimensioned to 1103 points. The dielectric surface of the protein was defined as the molecular surface accessible to a spherical water probe of radius 1.4 Å. The relative dielectric constant of the protein interior was set to 2, the dielectric constant of the solvent was 80. The ionic strength of the solvent was 0 mM in the case of AChE, 100 mM for TPI, 150 mM for FBA and variable for SOD in accord with the corresponding experimental conditions.

In all cases, we used the following protocol to select the MIF comparison region:

1. The skin is assigned a thickness σ of 3Å extending from the probe accessible surface around the protein computed with a probe of radius (δ) 2 Å.

2. The region is assigned a radius of 10Å. If there are fewer than 50 points for comparison in this region, the radius is incremented in 2 Å intervals until enough points are included for reliable calculations.

3. The center of the region is assigned in the active or binding site. How this is chosen depends on the information available on the enzyme under study. If a structure is known of an enzyme-ligand complex, the region can be centered on the ligand. If the catalytic residues are known, the center can be one of these or their geometric center. The comparison region should cover the region in which the ligand interacts with the enzyme, but may also be chosen to include a region relevant for ligand access such as the gorge in AChE.

For AchE, the MIF comparison region was selected to be within 10Å of a point near His447 in the substrate active site gorge [25] (namely, the mid-point between the atomic coordinates of H447ND1 and Y124OH in the X-ray crystal structure with PDB identifier 1mah, subunit A). The spherical region thereby included the active site, the gorge and the peripheral substrate binding site at the gorge entrance. There were about 50 grid points in the 3 Å thick skin extending from the surface accessible to a 2 Å radius probe. Residues Glu202 was protonated and His447 was doubly protonated, as described previously [53].

SOD structures were modeled based on the structure of Bovine superoxide dismutase [54]. The pH dependence of protein charges was modeled by calculating the pKa values of ionisable sites using a Poisson-Boltzmann electrostatics method [28]. Calculations were done for assignments of high (78) and low (4) values of the dielectric constant of the protein. The charges of the Cu and Zn ions and their ligands were assigned as described previously [27] and these charges were not changed with pH. The electrostatic potentials were compared in the "skin" region within 10 Å of the catalytic copper ion. The comparison region in TPI and FBA cases has the same radius with the centers assigned as described in the text.

Abbreviations

AChE – Acetylcholinesterase

ATCh – Acetylthiocholine

FBA – Class I fructose-1,6-bisphosphate aldolase

Km – Michaelis-Menten constant

kcat – Turnover number

ln – Natural logarithm

LTO – Leave-two-out cross-validation

MIF – Molecular interaction field

PIPSA – Protein Interaction Property Similarity Analysis

PLS – Partial Least Squares

RMSD – Root mean square deviation

QSAR – Quantitative structure activity relationship

SOD – Superoxide dismutase

TFK+ – m-trimethylammoniotrifluoroacetophenone

TPI – Triose phosphate isomerase

Authors' contributions

RRG carried out the calculations and modeling; MS built the Modeller structural models and assessed the accuracy of modeled structures; RRG, MS and RCW participated in the design and coordination of the study, the analysis of results and the writing of the manuscript. All the authors have read and approved the final version of the manuscript.

Acknowledgements

This work was supported by the BMBF Systems Biology Initiative HepatoSys Grant nos. 0313076 and 0313078C, the Klaus Tschira Foundation and the Heidelberg Center for Modelling and Simulation in the Biosciences (BIOMS).

References

  • Kitano H. Systems biology: Brief overview. Science. 2002;295:1662–1664. doi: 10.1126/science.1069492. [PubMed] [Cross Ref]
  • Gabdoulline RR, Kummer U, Olsen LF, Wade RC. Concerted simulations reveal how peroxidase compound III formation results in cellular oscillations. Biophys J. 2003;85:1421–1428. [PMC free article] [PubMed]
  • Kettner C, Hicks MG. Chaos in the world of enzymes - How valid is functional characterization without methodological experimental data? In: Hicks MG and Kettner C, Logos Verlag; Berlin, editor. Experimental standard conditions of enzyme characterization, Proceedings of the 1st International Beilstein Symposium. Ruedesheim/Rhein; 2003. pp. 1–16.
  • Garcia-Viloca M, Gao J, Karplus M, Truhlar DG. How enzymes work: Analysis by modern reaction rate theory and computer simulations. Science. 2004;303:186–195. doi: 10.1126/science.1088172. [PubMed] [Cross Ref]
  • Gabdoulline RR, Wade RC. Biomolecular diffusional association. Curr Opin Struct Biol. 2002;12:204–213. doi: 10.1016/S0959-440X(02)00311-1. [PubMed] [Cross Ref]
  • Wade RC. Brownian dynamics simulations of enzyme-substrate encounter. Biochem Soc Trans. 1996;24:254–259. [PubMed]
  • Wade RC. Calculation and Application of Molecular Interaction Fields. In: Cruciani G, editor. Molecular Interaction Fields Applications in Drug Discovery and ADME Prediction. Vol. 2. Weinheim, Wiley-VCH; 2005. pp. 27–42.
  • Schleinkofer K, Wiedemann U, Otte L, Wang T, Krause G, Oschkinat H, Wade RC. Comparative structural and energetic analysis of WW domain-peptide Interactions. J Mol Biol. 2004;344:865–881. doi: 10.1016/j.jmb.2004.09.063. [PubMed] [Cross Ref]
  • Cramer RD, Patterson DE, Bunce JD. Comparative molecular field analysis (CoMFA). 1. Effect of shape on binding of steroids to carrier proteins. J Am Chem Soc. 1988;110:5959–5967. doi: 10.1021/ja00226a005. [PubMed] [Cross Ref]
  • Cruciani G. Molecular Interaction Fields. In: Mannhold R, Kubinyi H and Folkers G, editor. Methods and Principles in Medicinal Chemistry. Vol. 27. Weinheim, WILEY-VCH; 2006.
  • Kmunicek J, Hynkova K, Jedlicka T, Nagata Y, Negri A, Gago F, Wade RC, Damborsky J. Quantitative Analysis of Substrate Specificity of Haloalkane Dehalogenase LinB from Sphingomonas paucimobilis UT26. Biochemistry. 2005;44:3390–3401. doi: 10.1021/bi047912o. [PubMed] [Cross Ref]
  • Blomberg N, Gabdoulline RR, Nilges M, Wade RC. Classification of protein sequences by homology modeling and quantitative analysis of electrostatic similarity. Proteins. 1999;37:379–387. doi: 10.1002/(SICI)1097-0134(19991115)37:3<379::AID-PROT6>3.0.CO;2-K. [PubMed] [Cross Ref]
  • Wade RC, Gabdoulline RR, Rienzo FD. Protein interaction property similarity analysis. Intl J Quant Chem. 2001;83:122–127. doi: 10.1002/qua.1204. [Cross Ref]
  • Winn PJ, Religa TL, Battey JD, Banerjee A, Wade RC. Determinants of Functionality in the Ubiquitin Conjugating Enzyme Family. Structure. 2004;12:1563–1574. doi: 10.1016/j.str.2004.06.017. [PubMed] [Cross Ref]
  • De Rienzo F, Gabdoulline RR, Menziani MC, Wade RC. Blue Copper Proteins: A Comparative Analysis of their Molecular Interaction Properties. Protein Sci. 2000;9:1439–1454. [PMC free article] [PubMed]
  • Warshel A. Energetics of enzyme catalysis. Proc Natl Acad Sci USA. 1978;75:5250–5254. doi: 10.1073/pnas.75.11.5250. [PMC free article] [PubMed] [Cross Ref]
  • Stroppolo ME, Falconi M, Caccuri AM, Desideri A. Superefficient enzymes. Cell Mol Life Sci. 2001;58:1451–1460. doi: 10.1007/PL00000788. [PubMed] [Cross Ref]
  • Hodgkin EE, Richards WG. Molecular similarity based on electrostatic potential and electric field. Int J Quant Chem Quant Biol Symp. 1987;14:105–110. doi: 10.1002/qua.560320814. [Cross Ref]
  • Stein M, Gabdoulline RR, Besson B, Wade RC. Proceedings of the 2nd Beilstein Symposium on Experimental Standard Conditions on Enzyme Characterization. Logos Verlag, Berlin; 2007. The estimation of kinetic parameters in Systems Biology by comparing molecular interaction fields of enzymes; pp. 237–53.
  • De Rienzo F, Gabdoulline RR, Menziani MC, DeBenedetti PG, Wade RC. Electrostatic and Brownian dynamics simulation analysis of plastocyanin and cytochrome f. Biophys J. 2001;81:3090–3104. [PMC free article] [PubMed]
  • Zhou HX, Wong KY, Vijayakumar M. Design of fast enzymes by optimizing interaction potential in active site. Proc Natl Acad Sci USA. 1997;94:12372–12377. doi: 10.1073/pnas.94.23.12372. [PMC free article] [PubMed] [Cross Ref]
  • Radic Z, Kirchhoff PD, Quinn DM, McCammon JA, Taylor P. Electrostatic influence on the kinetics of ligand binding to acetylcholinesterase. Distinctions between active center ligands and fasciculin. J Biol Chem. 1997;272:23265–23277. doi: 10.1074/jbc.272.37.23265. [PubMed] [Cross Ref]
  • Tara S, Elcock AH, Kirchhoff PD, Briggs JM, Radic Z, Taylor P, McCammon JA. Rapid binding of a cationic active site inhibitor to wild type and mutant mouse acetylcholinesterase: Brownian dynamics simulation including diffusion in the active site gorge. Biopolymers. 1998;46:465–474. doi: 10.1002/(SICI)1097-0282(199812)46:7<465::AID-BIP4>3.0.CO;2-Y. [PubMed] [Cross Ref]
  • Zhou HX, Briggs JM, Tara S, McCammon JA. Correlation between rate of enzyme-substrate diffusional encounter and average Boltzmann factor around active site. Biopolymers. 1998;45:355–360. doi: 10.1002/(SICI)1097-0282(19980415)45:5<355::AID-BIP4>3.0.CO;2-K. [PubMed] [Cross Ref]
  • Botti SA, Felder CE, Lifson S, Sussman JL, Silman I. A modular treatment of molecular traffic through the active site of cholinesterase. Biophys J. 1999;77:2430–2450. [PMC free article] [PubMed]
  • Argese E, Girotto R, Orsega EF. Comparative kinetic srudy between native and chemically modified Cu, Zn superoxide dismutases. Biochem J. 1993;292:451–455. [PMC free article] [PubMed]
  • Wade RC, Gabdoulline RR, Luedemann S, Lounnas V. Electrostatic steering and ionic tethering in enzyme-ligand binding: Insights from simulations. Proc Natl Acad Sci USA. 1998;95:5942–5949. doi: 10.1073/pnas.95.11.5942. [PMC free article] [PubMed] [Cross Ref]
  • Demchuk E, Wade RC. Improving the continuum dielectric approach to calculating pKas of ionizable groups in proteins. J Phys Chem. 1996;100:17373–17387. doi: 10.1021/jp960111d. [Cross Ref]
  • Knowles J. Enzyme catalysis: not different, just better. Nature. 1991;350:121–124. doi: 10.1038/350121a0. [PubMed] [Cross Ref]
  • Wade RC, Gabdoulline RR, Luty B. Species dependence of enzyme-substrate encounter rates for triose phosphate isomerases. Proteins. 1998;31:406–416. doi: 10.1002/(SICI)1097-0134(19980601)31:4<406::AID-PROT7>3.0.CO;2-F. [PubMed] [Cross Ref]
  • Schomburg I, Chang A, Ebeling C, Gremse M, Heldt C, Huhn G, Schomburg D. BRENDA, the enzyme database: updates and major new developments. Nucleic Acids Res. 2004;32:D431–3. doi: 10.1093/nar/gkh081. [PMC free article] [PubMed] [Cross Ref]
  • Eswar N, Marti-Renom MA, Webb B, Madhusudhan MS, Eramian D, Shen MY, Pieper U, Sali A. Comparative Protein Structure Modeling with MODELLER. Current Protocols in Bioinformatics, John Wiley & Sons, Inc. 2006;Supplement 15:5.6.1–5.6.30.
  • Rutter WJ. Evolution of aldolase. Fed Proc. 1964;23:1248 1257. [PubMed]
  • Brady GP, Stouten PFW. Fast prediction and visualization of protein binding pockets with PASS. Journal of Computer-Aided Molecular Design. 2000;14:383–401. doi: 10.1023/A:1008124202956. [PubMed] [Cross Ref]
  • Pezza JA, Choi KH, Berardini TZ, Beernink PT, Allen KN, Tolan DR. Spatial clustering of isozyme-specific residues reveals unlikely determinants of isozyme specificity in fructose-1,6-biphosphate aldolase. Journal of Biological Chemistry. 2003;278:17307–17313. doi: 10.1074/jbc.M209185200. [PubMed] [Cross Ref]
  • Arakaki TL, Pezza JA, Cronin MA, Hopkins CE, Zimmer DB, Tolan DR, Allen KN. Structure of human brain fructose 1,6-(bis)phosphate aldolase: Linking isozyme structure with function. Protein Sci. 2004;13:3077–3084. doi: 10.1110/ps.04915904. [PMC free article] [PubMed] [Cross Ref]
  • Gamblin SJ, Davies GJ, Grimes JM, Jackson RM, Littlechild JA, Watson HC. Activity and specificity of human aldolases. J Mol Biol. 1991;219:573–576. doi: 10.1016/0022-2836(91)90650-U. [PubMed] [Cross Ref]
  • Blom N, Sygusch J. Product binding and role of the C-terminal region in class I D-fructose 1,6-bisphosphate aldolase. Nat Struct Biol. 1997;4:36–39. doi: 10.1038/nsb0197-36. [PubMed] [Cross Ref]
  • Guex N, Peitsch MC. Swiss-Model and the Swiss-Pdb viewer: An environment for comparative protein modeling. Electrophoresis. 1997;18:2614–2723. doi: 10.1002/elps.1150181505. [PubMed] [Cross Ref]
  • Pompliano DL, Peyman A, Knowles JR. Stabilization of a reaction intermediate as a catalytic device: Definition of the functional role of the flexible loop in trosephosphate isomerase. Biochemistry. 1990;29:3186–3194. doi: 10.1021/bi00465a005. [PubMed] [Cross Ref]
  • Xiang J, Jung JY, Sampson NS. Entropy effects on protein hinges: the reaction catalyzed by triosephosphate isomerase. Biochemistry. 2004;43:11436–11445. doi: 10.1021/bi049208d. [PubMed] [Cross Ref]
  • Selzer T, Schreiber G. Predicting the rate enhancement of protein complex formation from the electrostatic energy of interaction. J Mol Biol. 1999;287:409–419. doi: 10.1006/jmbi.1999.2615. [PubMed] [Cross Ref]
  • Spaar A, Dammer C, Gabdoulline RR, Wade RC, Helms V. Diffusional encounter of barnase and barstar. Biophys J. 2006;90:1913–1924. doi: 10.1529/biophysj.105.075507. [PMC free article] [PubMed] [Cross Ref]
  • Wallner B, Elofsson A. All are not equal: A benchmark of different homology modeling programs. Protein Sci. 2005;14:1315–1327. doi: 10.1110/ps.041253405. [PMC free article] [PubMed] [Cross Ref]
  • Brooks BR, Bruccoleri RE, Olafson BD, States DJ, Swaminathan A, Karplus M. CHARMM: A program for macromolecular energy, minimization, and dynamics calculations. J Comp Chem. 1983;4:187–217. doi: 10.1002/jcc.540040211. [Cross Ref]
  • Vriend G. WHAT IF: A molecular modeling and drug design program. J Mol Graph. 1990;8:52–56. doi: 10.1016/0263-7855(90)80070-V. [PubMed] [Cross Ref]
  • Sippl MJ. Recognition of errors in three-dimensional structures of proteins. Proteins. 1993;17:355–362. doi: 10.1002/prot.340170404. [PubMed] [Cross Ref]
  • Eisenberg D, Luthy R, Bowie JU. VERIFY3D: assessment of protein models with three-dimensional profiles. Methods Enzymol. 1997;277:396–404. [PubMed]
  • Colovos C, Yeates TO. Verification of protein structures: patterns of nonbonded atomic interactions. Protein Sci. 1993;2:1511–1519. [PMC free article] [PubMed]
  • Goodford PJ. A computational procedure for determining energetically favorable binding sites on biologically important macromolecules. J Med Chem. 1985;28:849–857. doi: 10.1021/jm00145a002. [PubMed] [Cross Ref]
  • Madura JD, Briggs JM, Wade RC, Davis ME, Luty BA, Ilin A, Antosiewicz J, Gilson MK, Bagheri B, Scott LR, McCammon JA. Electrostatics and diffusion of molecules in solution: Simulations with the University of Houston Brownian dynamics program. Comp Phys Comm. 1995;91:57–95. doi: 10.1016/0010-4655(95)00043-F. [Cross Ref]
  • Jorgensen WL, Tirado-Rives J. The OPLS potential function for proteins: energy minimization for crystals of cyclic peptides and crambin. J Am Chem Soc. 1988;110:1657–1666.
  • Elcock AH, Gabdoulline RR, Wade RC, McCammon JA. Computer simulation of protein-protein association kinetics: acetylcholinesterase-fasciculin. J Mol Biol. 1999;291:149–162. doi: 10.1006/jmbi.1999.2919. [PubMed] [Cross Ref]
  • Hough MA, Hasnain SS. Crystallographic structures of bovine copper-zinc superoxide dismutase reveal asymmetry in two subunits: functionally important three and five coordinate copper sites captured in the same crystal. J Mol Biol. 1999;287:579–592. doi: 10.1006/jmbi.1999.2610. [PubMed] [Cross Ref]
  • Stroppolo ME, Sette M, O'Neill P, Polizio F, Cambria MT, Desideri A. Cu,Zn superoxide dismutase from Photobacterium leiognathi is an hyperefficient enzyme. Biochemistry. 1998;37:12287–12292. doi: 10.1021/bi980563b. [PubMed] [Cross Ref]
  • Maithal K, Ravindra G, Nagaraj G, Kumar-Singh S, Balaram H, Balaram P. Subunit interface mutation disrupting an aromatic cluster in Plasmodium falciparum triosephosphate isomerase: effect on dimer stability. Protein Eng. 2002;15:575–584. doi: 10.1093/protein/15.7.575. [PubMed] [Cross Ref]
  • Zomosa-Signoret V, Hernandez-Alcantara G, Reyes-Vivas H, Martinez-Martinez E, Garza-Ramos G, Peez-Montfort R, Gomez-Puyou MT, Gomez-Puyou A. Control of the reactivation kinetics of homodimeric triosephosphate isomerase from unfolded monomers. Biochemistry. 2003;42:3311–3318. doi: 10.1021/bi0206560. [PubMed] [Cross Ref]
  • Straus D, Raines R, Kawashima E, Knowles JR, Gilber W. Biochemistry Active site of triosephosphate isomerase: In vitro mutagenesis and characterization of an altered enzyme . Proc Natl Acad Sci USA. 1985 ;82:2272–2276. doi: 10.1073/pnas.82.8.2272. [PMC free article] [PubMed] [Cross Ref]
  • Lambeir AM, Opperdoes FR, Wierenga RK. Kinetic properties of triose-phosphate isomerase from Trypanosoma brucei brucei A comparison with the rabbit muscle and yeast enzymes. Eur J Biochem. 1987;168:69–74. doi: 10.1111/j.1432-1033.1987.tb13388.x. [PubMed] [Cross Ref]
  • Williams JC, Zeelen JP, Neubauer G, Vriend G, Backmann J, Michels PAM, Lambeir AM, Wierenga RK. Structural and mutagenesis studies of leishmania triosephosphate isomerase: a point mutation can convert a mesophilic enzyme into a superstable enzyme without losing catalytic power . Protein Eng. 1999 ;12:243–250. doi: 10.1093/protein/12.3.243. [PubMed] [Cross Ref]
  • Alvarez M, Zeelen JP, Mainfroid V, Rentier-Delrue F, Martial JA, Wyns L, Wierenga RK, Maes D. Triose-phosphate isomerase (TIM) of the psychrophilic bacterium Vibrio marinus: Kinetic and structural properties. J Biol Chem. 1998;273:2199–2206. doi: 10.1074/jbc.273.4.2199. [PubMed] [Cross Ref]
  • Mainfroid V, Terpstra P, Beauregard M, Frere JM, Mande SC, Hol WGJ, Martial JA, Goraj K. Three hTIM mutants that provide new insights on why TIM is a dimer . J Mol Biol. 1996;257:441–456. doi: 10.1006/jmbi.1996.0174. [PubMed] [Cross Ref]
  • Lopez-Velazquez G, Molina-Ortiz D, Cabrera N, Hernandez-Alcantara G, Peon-Peralta J, Yepez-Mulia L, Perez-Montfort R, Reyes-Vivas H. An unusual triosephosphate isomerase from the early divergent eukaryote Giardia lamblia. Proteins: Str Function Bioinformatics. 2004;55:824–834. doi: 10.1002/prot.20097. [PubMed] [Cross Ref]
  • Tang GL, Wang YF, Bao JS, Chen HB. Overexpression in Escherichia coli and characterization of the chloroplast triosephosphate isomerase from Spinach . Protein Expr Pur. 1999;16:432–439. doi: 10.1006/prep.1999.1087. [PubMed] [Cross Ref]
  • Callens M, Kuntz DA, Opperdoes FR. Kinetic properties of fructose bisphosphate aldolase from rabbit muscle and Staphylococcus aureus. Mol Biochem Parasitol. 1991;47:1–10. doi: 10.1016/0166-6851(91)90142-S. [PubMed] [Cross Ref]
  • Zhang R, Kusakabe T, Iwanaga N, Sugimoto Y, Kondo K, Takasaki Y, Imai T, Yoshida M, Hori K. Lamprey fructose-1,6-bisphosphate aldolase: characterization of the muscle-type and non-muscle-type isozymes. Arch Biochem Biophys. 1997;341:170–176. doi: 10.1006/abbi.1997.9918. [PubMed] [Cross Ref]
  • de Walque S, Opperdoes FR, Michels PAM. Cloning and characterization of Leishmania mexicana fructose-1,6-bisphosphate aldolase. Mol Biochem Parasitol. 1999;103:279–283. doi: 10.1016/S0166-6851(99)00140-1. [PubMed] [Cross Ref]
  • Kusakabe T, Motoki K, Hori K. Human aldolase C: characterization of the recombinant enzyme expressed in Escherichia coli. J Biochem. 1994;115:1172–1177. [PubMed]

Articles from BMC Bioinformatics are provided here courtesy of BioMed Central

Formats:

Related citations in PubMed

See reviews...See all...

Cited by other articles in PMC

See all...

Links

Recent Activity

Your browsing activity is empty.

Activity recording is turned off.

Turn recording back on

See more...