
# Coarse-Grained Biomolecular Simulation with REACH: Realistic Extension Algorithm via Covariance Hessian

^{*}Center for Molecular Biophysics, University of Tennessee/Oak Ridge National Laboratory, Oak Ridge, Tennessee; and

^{†}Computational Molecular Biophysics, Interdisciplinary Center for Scientific Computing (IWR), University of Heidelberg, Heidelberg, Germany

## Abstract

Coarse-graining of protein interactions provides a means of simulating large biological systems. Here, a coarse-graining method, REACH, is introduced, in which the force constants of a residue-scale elastic network model are calculated from the variance-covariance matrix obtained from atomistic molecular dynamics (MD) simulation. In test calculations, the C_{α}-atoms variance-covariance matrices are calculated from the ensembles of 1-ns atomistic MD trajectories in monomeric and dimeric myoglobin, and used to derive coarse-grained force constants for the local and nonbonded interactions. Construction of analytical model functions of the distance-dependence of the interresidue force constants allows rapid calculation of the REACH normal modes. The model force constants from monomeric and dimeric myoglobin are found to be similar in magnitude to each other. The MD intra- and intermolecular mean-square fluctuations and the vibrational density of states are well reproduced by the residue-scale REACH normal modes without requiring rescaling of the force constant parameters. The temperature-dependence of the myoglobin REACH force constants reveals that the dynamical transition in protein internal fluctuations arises principally from softening of the elasticity in the nonlocal interactions. The REACH method is found to be a reliable way of determining spatiotemporal protein motion without the need for expensive computations of long atomistic MD simulations.

## INTRODUCTION

Internal dynamics is essential for protein function, including ligand binding to macromolecules and mechanical structural change (1–7). Among the theoretical methods used to determine protein dynamics at atomic detail are molecular dynamics (MD) simulation using an anharmonic molecular mechanics potential function, and normal mode analysis (NMA), in which the potential energy is approximated as harmonic, leading to representation of the motions as a combination of vibrational modes (8–10).

Use of the harmonic approximation considerably simplifies the potential energy surface, and NMA has thus been applied to a wide range of biological macromolecules (e.g., 11–17). However, the difficulty remains of high computational costs for large systems. To overcome this difficulty, a further simplification of NMA, the elastic network model (ENM), has been proposed and applied in various studies such as structural changes of viruses (18), the dynamical behavior of GroEL-GroES complex on ATP binding (19), the motions of domain-swapped proteins (20), a database analysis of structural changes on ligand binding (21), and structural rearrangement on protein:protein interaction (22). In ENM, the protein molecule is represented by point residues centered, for example, at the C_{α} atoms, and connected by harmonic springs (23–27). ENM enables collective vibrational motions to be rapidly calculated. Its simplest form uses one force-constant parameter, assuming all the force constants between residue pairs within a cutoff distance, *r*_{cut}, to have the same value, *k*_{c}. In an extended form, the additional force-constant parameters for the interactions within *α*-helices or the intradomain interaction were also introduced (28,29). The magnitude of *k*_{c} can be determined by comparing the atomic fluctuations, the N-H bond order parameter (30,31), or the coordinate probability distribution (28), from the elastic network normal modes with those from experiment or MD simulation. In more sophisticated approaches, coarse-grained force fields have been determined by iteratively matching the internal coordinate fluctuations (32) or the forces of the coarse-grained sites (33) to all-atom MD. In these methods, therefore, *k*_{c} is determined a posteriori and ENM in this form cannot predict the magnitude of protein fluctuations. Furthermore, *k*_{c} has been found to be strongly dependent on *r*_{cut} (23).

To make ENM more widely applicable, in this study a new method is introduced, REACH (Realistic Extension Algorithm via Covariance Hessian), in which the residue interaction force constants are obtained directly from the atomic-detail variance-covariance matrix calculated using MD simulation. In this way, physically based atomic MD force fields can be projected onto interresidue force constants.

The methodology derived here calculates the REACH force constants by relating the harmonic-approximated potential energy of the ENM to the Hessian (second-derivative) matrix and then to the variance-covariance matrix. As a test case, MD trajectories of myoglobin are used to calculate REACH force constants for this protein. Associated model functions are also constructed, allowing analytical calculation of the distance dependence of the local (virtual bonds, angles, and torsions) and the nonbonded interactions, enabling straightforward application in elastic network normal mode calculation. To examine whether the protein vibrational motions simulated by atomistic MD are reproduced by the REACH method, the mean-square fluctuation and vibrational density of states are calculated and compared.

Next, anticipating that a major use of coarse-grained models will be in the dynamics of protein complexes, a myoglobin dimer is examined using the REACH model. Force constants and their model functions in dimeric myoglobin are calculated and compared with the monomer results. The mean-square fluctuations of the intramolecular and intermolecular motions are separately determined so as to characterize protein:protein interactions.

The temperature dependence of protein dynamics has been extensively studied using both experiments and computer simulations. The dynamical “glass” transition phenomenon in protein fluctuations at *T*_{g} ~ 180−220 K (34–39) has been characterized in terms of softening of the effective potential wells (40). As the force constant determines the curvature of the potential energy surface, analyzing force constants from MD at different temperatures provides a direct method for examining temperature-dependent protein dynamics. Here, as an application of the methodology, the REACH force constants are calculated from myoglobin MD trajectories at various temperatures and compared. The results shed light on the length-scales important in determining protein dynamical transition phenomena.

The REACH method introduced here is a true multiscale method in that detailed information from atomic scale MD is mapped onto a coarse-grained model of macromolecule systems, allowing, in principle, extension of length scales examined by an order of magnitude.

## THEORY AND METHODS

### Dynamical simulations

Molecular dynamics simulations of myoglobin were performed. The simulation details of the monomer are described in the literature (41,42). Briefly, the simulations were performed at constant temperature and pressure (1 atm) conditions (the NPT ensemble) in an explicit water box using the CHARMM program (43) and CHARMM 22 all-atom potential function (44). Production runs were performed for 10 ns for the monomer simulation at 300 K. Shorter runs, of 1 ns, were also performed for the monomer at 12 temperatures ranging from 150 K to 280 K. The atomic coordinates were saved every 50 fs for analysis.

The dimer MD simulations were carried out using starting coordinates from Protein Data Bank (PDB) (45) entry 1A6G (46). A rectangular primary simulation box was built of dimensions 100 Å × 75 Å × 60 Å. To explicitly solvate the protein, 12,902 TIP3P water molecules (47) and 26 chloride counterions were placed in the box, leading to an electrically neutral system of 43,780 atoms. The simulations were performed using the same potential function and simulation protocol as for monomeric myoglobin. Production runs were performed for 5 ns and the atomic coordinates were saved every 50 fs for analysis.

### REACH: Calculation of force constants

We now summarize the basic concept of the REACH model, together with the method for calculating the corresponding model parameters from MD trajectories. The coarse-grained model was built using only C_{α} atom coordinates. The mass of each pseudo-atom was defined as the sum of atomic masses comprising the corresponding residue. Heteroatoms such as ions and the heme group in myoglobin were not included.

Given the coordinates of the residues in the system, the potential energy is (23)

$$V = \frac{1}{2}\sum_{i<j}^{N} k_{ij}\left(d_{ij} - d_{ij}^{0}\right)^{2}, \tag{1}$$

where *N* is the number of residues, *d*_{ij} (*d*_{ij}^{0}) is the distance between the dynamical (equilibrium) coordinates of C_{α} atoms *i* and *j*, and *k*_{ij} is the force constant for the spring between *i* and *j*.

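Given the displacement vector in the 3*N* dimensional representation, Δ**r**, i.e., Δ**r** = **r** − **r**^{0}, Eq. 1 becomes

$$V = \frac{1}{2}\,\Delta\mathbf{r}^{\mathrm{T}}\,\mathbf{K}\,\Delta\mathbf{r}, \tag{2}$$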

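where the force constant (Hessian) matrix **K** is

$$K_{is,jt} = \left.\frac{\partial^{2} V}{\partial r_{i,s}\,\partial r_{j,t}}\right|_{\mathbf{r}^{0}} = \begin{cases} -k_{ij}\,\dfrac{x_{ij,s}^{0}\,x_{ij,t}^{0}}{\left(d_{ij}^{0}\right)^{2}}, & i \neq j, \\[2ex] \displaystyle\sum_{l \neq i} k_{il}\,\dfrac{x_{il,s}^{0}\,x_{il,t}^{0}}{\left(d_{il}^{0}\right)^{2}}, & i = j, \end{cases} \tag{3}$$

where *x*_{ij,s}^{0} = *r*_{j,s}^{0} − *r*_{i,s}^{0} and *s*, *t* ∈ {*x*, *y*, *z*}. The force constant between *i* and *j* is then derived as

$$k_{ij} = -\sum_{s} K_{is,js}.$$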

Making the harmonic approximation under the equilibrium condition at constant temperature, *T*, allows the Hessian matrix to be calculated from the variance-covariance matrix, **C** = ⟨Δ**r** Δ**r**^{T}⟩ (48), as

$$\mathbf{K} = k_{\mathrm{B}} T\,\mathbf{C}^{-1}, \tag{4}$$

where *k*_{B} is the Boltzmann constant. The value *k*_{ij} can then be calculated using Eqs. 3 and 4 as

$$k_{ij} = -k_{\mathrm{B}} T \sum_{s}\left[\mathbf{C}^{-1}\right]_{is,js}. \tag{5}$$

We note again that this derivation is exact only in the harmonic approximation.
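The link between pairwise springs and the Hessian can be checked numerically. The following is a minimal sketch, assuming the standard pairwise-spring form of the elastic network energy; the function names `enm_hessian` and `pair_force_constants`, the random coordinates, and the random force constants are illustrative assumptions, not taken from the paper. It builds the 3*N* × 3*N* Hessian from given springs and then recovers each force constant as minus the trace of the corresponding off-diagonal 3 × 3 block.

```python
import numpy as np

def enm_hessian(coords, k):
    """Hessian of V = 1/2 * sum_{i<j} k_ij (d_ij - d_ij^0)^2 at equilibrium."""
    n = len(coords)
    K = np.zeros((3 * n, 3 * n))
    for i in range(n):
        for j in range(i + 1, n):
            d = coords[j] - coords[i]
            # Off-diagonal block: -k_ij * (d d^T) / |d|^2
            block = -k[i, j] * np.outer(d, d) / d.dot(d)
            K[3*i:3*i+3, 3*j:3*j+3] = block
            K[3*j:3*j+3, 3*i:3*i+3] = block      # block is symmetric
            # Diagonal blocks accumulate minus the off-diagonal blocks
            K[3*i:3*i+3, 3*i:3*i+3] -= block
            K[3*j:3*j+3, 3*j:3*j+3] -= block
    return K

def pair_force_constants(K):
    """Recover k_ij as minus the trace of each off-diagonal 3x3 block."""
    n = K.shape[0] // 3
    k = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i != j:
                k[i, j] = -np.trace(K[3*i:3*i+3, 3*j:3*j+3])
    return k

# Hypothetical toy system: 5 sites with random positive springs.
rng = np.random.default_rng(0)
coords = rng.normal(size=(5, 3)) * 5.0
k_in = np.triu(rng.uniform(1.0, 10.0, size=(5, 5)), 1)
k_in = k_in + k_in.T
K = enm_hessian(coords, k_in)
k_out = pair_force_constants(K)
```

The trace works because the squared direction cosines of each pair vector sum to one, so the geometric factor drops out.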

Calculation of the Hessian matrix directly from **C** (in Eq. 4) leads to numerical errors arising from the calculation of the inverse matrix. To overcome this difficulty, **C** was diagonalized and **K** was then calculated from the eigenvalues and eigenvectors of **C**. **C** is diagonalized as

$$\mathbf{C} = \mathbf{U}\,\boldsymbol{\Delta}\,\mathbf{U}^{\mathrm{T}}, \tag{6}$$

where **U** is the associated eigenvector matrix and Δ_{ij} = (*λ*_{i}*δ*_{ij}) is the eigenvalue matrix.

**K** is then derived as follows:

$$K_{ij} = k_{\mathrm{B}} T \sum_{l} \frac{U_{il}\,U_{jl}}{\lambda_{l}}. \tag{7}$$

Inspection of Eq. 7 shows that numerical error can originate from the six modes (translation and rotation modes) with eigenvalues numerically close to zero (in such cases 1/*λ*_{l} becomes infinitely large). These modes were thus omitted from the calculation of **K**.
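The eigen-decomposition route can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the function name `hessian_from_covariance`, the synthetic 15-dimensional test matrix, and the value of *k*_{B}*T* (about 2.494 kJ/mol at 300 K) are all chosen here for the example. The six smallest eigenvalues of the covariance matrix are discarded before inverting.

```python
import numpy as np

def hessian_from_covariance(C, kT, n_skip=6):
    """K = kT * sum_l u_l u_l^T / lambda_l, omitting the n_skip smallest modes."""
    lam, U = np.linalg.eigh(C)              # eigenvalues in ascending order
    lam, U = lam[n_skip:], U[:, n_skip:]    # drop the near-zero rigid-body modes
    return kT * (U / lam) @ U.T             # divides column l of U by lambda_l

# Synthetic check: a "true" Hessian with six exact zero modes.
rng = np.random.default_rng(1)
n_dof = 15
Q, _ = np.linalg.qr(rng.normal(size=(n_dof, n_dof)))   # random orthonormal basis
stiff = np.concatenate([np.zeros(6), rng.uniform(1.0, 50.0, n_dof - 6)])
K_true = (Q * stiff) @ Q.T

kT = 2.494                                   # kJ/mol, roughly 300 K
fluct = np.concatenate([np.zeros(6), kT / stiff[6:]])
C = (Q * fluct) @ Q.T                        # covariance with rigid-body motion removed

K_rec = hessian_from_covariance(C, kT)
```

A direct `np.linalg.inv(C)` would fail or be dominated by noise for this singular matrix, which is the difficulty the truncated sum avoids.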

In Eq. 1 *k*_{ij} of any pair of atoms is assumed to be positive and thus the off-diagonal Hessian (in Eq. 3) to be negative. However, in general, some off-diagonal components in a Hessian matrix may be positive even if the potential energy is positive-definite harmonic: a Hessian matrix with positive off-diagonal elements can still have only nonnegative eigenvalues. Thus, the assumption in Eq. 1 is stronger than the standard harmonic approximation, as some off-diagonal elements of **H** calculated from the MD-derived **C** may be positive.
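A two-coordinate toy case makes the point concrete. The matrix below is an illustrative choice made for this sketch, not the example from the original text: it is the Hessian of the harmonic energy *V* = ½(*x*_{1} + *x*_{2})^{2}, which is never negative, yet its off-diagonal element is positive, so the spring constant implied by the sign convention of Eq. 1 would be negative.

```python
import numpy as np

# Hessian of V(x1, x2) = 0.5 * (x1 + x2)**2
H = np.array([[1.0, 1.0],
              [1.0, 1.0]])

eigenvalues = np.linalg.eigvalsh(H)   # ascending; numerically close to 0 and 2
k_12 = -H[0, 1]                       # pair force constant implied by the ENM form
```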

The anharmonicity in atomistic MD simulations may lead to the force constants deviating from the “ideal” harmonic approximation quantities, but the contribution will likely be small, i.e., this is not the main origin of negative force constants. Inspection of the distribution of *k*(*r*) in Results (Figs. 1 *a* and 3 *a*) suggests that the errors can be assumed to be random, i.e., the MD-derived force constant, *k*_{MD}, can be written as

$$k_{\mathrm{MD}} = k + k_{\mathrm{err}}, \tag{8}$$

where *k*_{err} is assumed to be random. The results also show that most *k*_{MD} are positive and distribute around the average, *k*, with a width of |*k*_{err}| when the pairwise distance, *r*, is small. *k*_{MD} are close to zero at large *r*, and therefore, the pairs are then considered to have no interaction.

Figure 1. (*a*) REACH force constants, *k*, of virtual (*a*1) 1-2, (*a*2) 1-3, (*a*3) 1-4, and (*a*4) nonbonded interactions in the myoglobin monomer as a function of pairwise distance, *r*. In the inset of *a*4 the averages of *k* within bins of 1 Å width are shown together **...**

Figure 3. (*a*) REACH force constants, *k*, of virtual (*a*1) 1-2, (*a*2) 1-3, (*a*3) 1-4 interactions and (*a*4) nonbonded intramolecular and (*a*5) intermolecular interactions in myoglobin dimer are shown as a function of pairwise distance, *r*. In *a*6, the averages of *k* (intra, **...**

**K**, derived from the MD trajectories via **C**, is the effective force constant representing the interaction between any given residue and the environment, the latter including both the other implicit atoms and the surrounding water molecules. In an explicit representation including both the degrees of freedom of the subsystem (e.g., C_{α} atoms), **r**_{s}, and the environment system, **r**_{e}, the effective force constant is derived as follows.

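The potential energy in Eq. 2 can be decomposed as (47)

$$V = \frac{1}{2}\left(\Delta\mathbf{r}_{s}^{\mathrm{T}}\mathbf{K}_{ss}\Delta\mathbf{r}_{s} + 2\,\Delta\mathbf{r}_{s}^{\mathrm{T}}\mathbf{K}_{se}\Delta\mathbf{r}_{e} + \Delta\mathbf{r}_{e}^{\mathrm{T}}\mathbf{K}_{ee}\Delta\mathbf{r}_{e}\right), \tag{9}$$

where **K**_{ss}, **K**_{se}, and **K**_{ee} are the subsystem-subsystem, subsystem-environment, and environment-environment blocks of **K**.

When **r**_{e} is projected onto **r**_{s} using the condition ∂*V*/∂Δ**r**_{e} = 0, i.e., Δ**r**_{e} = −**K**_{ee}^{−1}**K**_{se}^{T}Δ**r**_{s}, the effective potential energy in Eq. 2 becomes

$$V_{\mathrm{eff}} = \frac{1}{2}\,\Delta\mathbf{r}_{s}^{\mathrm{T}}\left(\mathbf{K}_{ss} - \mathbf{K}_{se}\mathbf{K}_{ee}^{-1}\mathbf{K}_{se}^{\mathrm{T}}\right)\Delta\mathbf{r}_{s}. \tag{10}$$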

In the force constant derived from the atomic-resolution MD trajectories with explicit water molecules, the solvent contributions can thus be included according to the form of Eq. 10, i.e., using the protein-water (**K**_{se}) and water-water (**K**_{ee}) interactions.
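The statement that a covariance-derived force constant already contains the environment can be illustrated numerically. The sketch below assumes a generic symmetric positive-definite Hessian split into a subsystem block (**K**_{ss}), a coupling block (**K**_{se}), and an environment block (**K**_{ee}); the function name `effective_hessian`, the random 8 × 8 matrix, and the 3/5 split are hypothetical. It compares the projected (Schur-complement) Hessian with the one obtained by inverting only the subsystem block of the full covariance matrix.

```python
import numpy as np

def effective_hessian(K, n_s):
    """Project the environment out: K_ss - K_se K_ee^{-1} K_se^T."""
    K_ss = K[:n_s, :n_s]
    K_se = K[:n_s, n_s:]
    K_ee = K[n_s:, n_s:]
    return K_ss - K_se @ np.linalg.solve(K_ee, K_se.T)

# Hypothetical full system: 3 subsystem and 5 environment coordinates.
rng = np.random.default_rng(2)
A = rng.normal(size=(8, 8))
K_full = A @ A.T + 8.0 * np.eye(8)        # symmetric positive definite
kT = 2.494                                # kJ/mol, roughly 300 K
n_s = 3

C_full = kT * np.linalg.inv(K_full)       # covariance of all coordinates
K_eff = effective_hessian(K_full, n_s)
K_from_C = kT * np.linalg.inv(C_full[:n_s, :n_s])   # uses subsystem covariances only
```

The two matrices agree, which is why covariances of the C_{α} atoms alone, taken from a fully solvated simulation, carry the solvent contribution implicitly.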

In this study each 1-ns MD trajectory was used to calculate the variance-covariance matrix: i.e., the full trajectories were separated into 1-ns trajectories, allowing calculation of **C** from each of the 1-ns trajectories and averaging of all the matrices derived. The time length of 1 ns is long enough to characterize the vibrational component of protein fluctuation, which arises from the harmonic potential, but is not so long that the intramolecular contribution is small compared to the slow, diffusive motion. Since normal modes represent vibrational motions on an effective harmonic potential, and since the lowest frequency of the atomistic normal modes is 3.84 cm^{−1} (the period is ~10 ps; results not shown), the MD time length of 1 ns is a suitable choice. From **C**, the Hessian matrices were separately calculated using Eq. 4 and averaged. This procedure leads to an accurate estimation of **H**, particularly the off-diagonal components corresponding to the correlation of two residues. The force constants were then calculated by combining with Eqs. 5–7.
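The windowing step can be sketched as below. This is a simplified illustration: the function name `windowed_covariance` and the synthetic trajectory (a slow linear drift plus fast unit-variance noise) are assumptions made for the example, and only the covariance averaging is shown, not the subsequent Hessian averaging. With the 50-fs frame spacing quoted above, a 1-ns window would hold 20,000 frames; a much shorter window is used here to keep the example small.

```python
import numpy as np

def windowed_covariance(traj, frames_per_window):
    """Average the covariance matrices of consecutive trajectory windows."""
    n_windows = traj.shape[0] // frames_per_window
    covs = []
    for w in range(n_windows):
        chunk = traj[w * frames_per_window:(w + 1) * frames_per_window]
        dev = chunk - chunk.mean(axis=0)      # fluctuation about the window mean
        covs.append(dev.T @ dev / chunk.shape[0])
    return np.mean(covs, axis=0)

# Synthetic trajectory: slow drift (diffusive-like) plus fast fluctuations.
rng = np.random.default_rng(5)
n_frames, n_dof = 4000, 6
drift = np.linspace(0.0, 10.0, n_frames)[:, None] * np.ones(n_dof)
traj = drift + rng.normal(size=(n_frames, n_dof))

C_win = windowed_covariance(traj, frames_per_window=400)

dev_all = traj - traj.mean(axis=0)
C_all = dev_all.T @ dev_all / n_frames        # single covariance over the whole run
```

In this toy case the whole-run covariance is dominated by the drift, whereas the windowed average stays close to the variance of the fast component.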

To separate the internal and external motions, the best fit to a reference protein structure was used to eliminate the translational and rotational motions. If the whole protein structure is used as the reference structure, a segment, e.g., one *α*-helix, may undergo external motion. Such external motion leads to errors in estimating covariances of atom pairs within the segment and thus errors in the corresponding force constants. To overcome this difficulty the system was separated into segments of 20 residues (test calculations using 10 ~ 30 residues led to only small differences (~2%) in the force constants derived; results not shown) and the submatrices of **C** were individually calculated by best-fitting each segment, thus allowing the local interactions within the segment, such as the virtual 1–2 (between residues *i* and *i*+1), 1–3 (between residues *i* and *i*+2), and 1–4 (between residues *i* and *i*+3) force constants, to be obtained.
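A best-fit superposition of one segment onto a reference can be sketched with the Kabsch algorithm. The text does not name the fitting algorithm, so Kabsch is an assumption here, as are the function name `best_fit` and the random 20-site segment used as test data.

```python
import numpy as np

def best_fit(mobile, reference):
    """Least-squares superposition of mobile onto reference (Kabsch algorithm)."""
    mob_c = mobile - mobile.mean(axis=0)
    ref_c = reference - reference.mean(axis=0)
    U, _, Vt = np.linalg.svd(mob_c.T @ ref_c)
    d = np.sign(np.linalg.det(Vt.T @ U.T))       # guard against improper rotations
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T      # optimal rotation matrix
    return mob_c @ R.T + reference.mean(axis=0)

# Hypothetical 20-residue segment, rigidly rotated and translated.
rng = np.random.default_rng(3)
ref = rng.normal(size=(20, 3))
theta = 0.7
Rz = np.array([[np.cos(theta), -np.sin(theta), 0.0],
               [np.sin(theta),  np.cos(theta), 0.0],
               [0.0,            0.0,           1.0]])
moved = ref @ Rz.T + np.array([1.0, -2.0, 3.0])

fitted = best_fit(moved, ref)
```

Applying such a fit frame by frame to each segment before computing covariances removes the segment's own rigid-body motion from the local force-constant estimates.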

Model parameter functions are useful for obtaining a simplified understanding of protein dynamics and for convenient application in coarse-grained MD simulation. Their functional forms were constructed here for each interaction type, depending on the shape of distributions. The parameters were calculated by fitting the distributions to the model functions.

### Calculation of mean-square fluctuation from normal modes and the contribution of intermolecular motions

From the set of eigenvalues, {*λ*_{i}}, and eigenvectors, {**v**_{i}}, the mean-square fluctuation of residue *n*, was calculated at temperature, *T*, as

where *v*_{n,i} is the displacement of residue *n* in mode *i*. To calculate the contribution to from intermolecular motions the in Eq. 11 were replaced by such that the intramolecular contributions are subtracted as

where **v**_{1,j} (**v**_{2,j}) is the normal-mode eigenvector calculated from the subsystem of the single monomer 1 (2). Note that these monomer eigenvectors (the number of residues in the monomer is *N*_{mono}) are given the full dimensions corresponding to the number of degrees of freedom of the dimer (i.e., 6*N*_{mono}) by setting the elements corresponding to the other monomer to zero (i.e., 3*N*_{mono} zeros are added).
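For orientation, one common way of obtaining per-residue mean-square fluctuations from normal modes is sketched below. It assumes that the modes are eigenvectors of the mass-weighted Hessian, which may differ in normalization from the form used in Eq. 11; the function name `msf_from_modes`, the random test Hessian, and the four residue masses are illustrative. Only the total fluctuation is computed, not the intermolecular decomposition.

```python
import numpy as np

def msf_from_modes(K, masses, kT, n_zero=6):
    """Per-site <x^2> = kT * sum_i |q_{n,i}|^2 / (m_n * lambda_i)."""
    m3 = np.repeat(masses, 3)                     # one mass per Cartesian coordinate
    K_mw = K / np.sqrt(np.outer(m3, m3))          # mass-weighted Hessian
    lam, Q = np.linalg.eigh(K_mw)
    lam, Q = lam[n_zero:], Q[:, n_zero:]          # skip rigid-body modes if present
    per_dof = kT * np.sum(Q**2 / lam, axis=1) / m3
    return per_dof.reshape(-1, 3).sum(axis=1)     # sum x, y, z for each site

# Check on a positive-definite toy Hessian (no zero modes, so n_zero=0).
rng = np.random.default_rng(4)
A = rng.normal(size=(12, 12))
K_demo = A @ A.T + 12.0 * np.eye(12)
masses = np.array([57.0, 71.0, 113.0, 186.0])     # approx. Gly, Ala, Leu, Trp residues
kT = 2.494

msf = msf_from_modes(K_demo, masses, kT, n_zero=0)
expected = kT * np.diag(np.linalg.inv(K_demo)).reshape(-1, 3).sum(axis=1)
```

The check reflects the fact that classical equilibrium fluctuations do not depend on the masses; the masses affect the mode frequencies only.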

## RESULTS

### Myoglobin monomer

In an initial set of calculations the variance-covariance matrices were calculated from the MD trajectories of monomeric myoglobin and subsequently the force constants derived using Eqs. 4–7. Analytical model parameter functions were fitted to these force constants and the resulting REACH model used for calculating normal modes. The mean-square fluctuations and vibrational densities of states thus obtained are compared with those from the MD so as to examine whether the reduced normal-mode model can reproduce these aspects of protein internal dynamics obtained from a full all-atom simulation.

In Fig. 1 *a* the force constants between the protein residue pairs are shown as a function of the pairwise distance, *r*. As expected, the distribution of the virtual 1-2 bond interaction force constant, *k*_{12}, is quantitatively different from the others: *k*_{12} is much larger and has a much narrower distribution width, centered at *r* ~ 3.82 Å, the typical nearest-neighbor C_{α} distance. The values *k*_{13} and *k*_{14} also have relatively narrow distribution widths, centered at *r* ~ 5.5 and 5.0 Å, respectively, which are again characteristic lengths for these local interactions, especially within the *α*-helices dominating the myoglobin structure. In contrast, the nonbonded interaction, *k*_{nb}, has a broader distribution at *r* > 4 Å. At *r* > 10 Å, *k*_{nb} are close to zero, meaning that the corresponding residues exert nearly no net forces on each other, and validating the conventional “one force-constant parameter model within a cutoff distance” (23–26) and a decaying force constant parameter function used empirically in the coarse-grained MD (27,50).

Model parameter functions were constructed by fitting to the distributions in Fig. 1 *a*. A constant value model was assumed for each of the virtual bond (1-2, 1-3, and 1-4) interactions (*k*_{12}, *k*_{13}, *k*_{14}) and a single exponential decay [*k*_{nb}(*r*) = *a* exp(−*br*)] for the nonbonded interaction. A nonlinear fitting procedure (51) was applied to the exponential model so as to avoid the overweighting of relatively smaller values (i.e., at larger *r*). The resulting values and standard-deviation errors are given in Table 1.
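The difference between a log-linear fit and a direct nonlinear fit of *k*_{nb}(*r*) = *a* exp(−*br*) can be sketched with a small Gauss-Newton loop. This is a stand-in for the nonlinear procedure cited above, not a reproduction of it; the function name `fit_exponential`, the distance grid, and the value *b* = 0.7 Å^{−1} used to generate the test data are hypothetical.

```python
import numpy as np

def fit_exponential(r, k, n_iter=50):
    """Least-squares fit of k(r) = a * exp(-b * r) by Gauss-Newton iteration."""
    # Starting guess from a log-linear fit on the positive values only.
    pos = k > 0
    slope, intercept = np.polyfit(r[pos], np.log(k[pos]), 1)
    a, b = np.exp(intercept), -slope
    for _ in range(n_iter):
        model = a * np.exp(-b * r)
        # Jacobian columns: d(model)/da and d(model)/db
        J = np.column_stack([model / a, -r * model])
        step, *_ = np.linalg.lstsq(J, k - model, rcond=None)
        a, b = a + step[0], b + step[1]
        if np.max(np.abs(step)) < 1e-12:
            break
    return a, b

# Noise-free synthetic data generated with illustrative parameters.
r = np.linspace(4.0, 12.0, 81)
k_data = 1040.0 * np.exp(-0.7 * r)
a_fit, b_fit = fit_exponential(r, k_data)
```

Because the residuals are minimized in *k* rather than in ln *k*, the small force constants at large *r* do not dominate the fit, which is the motivation stated in the text.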

The functional forms above resemble those of another force field recently derived for coarse-grained MD simulation (50). However, the present method is improved in three respects. First, in Trylska et al. (50), the pair distribution function, *g*(*r*), was calculated for atom pairs using conformational ensembles in the PDB and then the effective potential energy derived using Boltzmann inversion, i.e., *v*(*r*) = −*k*_{B}*T* ln *g*(*r*). The effective potential energy derived was then decomposed into several harmonic/Morse potential functions corresponding to local (1-2, 1-3, and so on) and nonlocal interactions. The associated force constants were derived from the curvatures of the potential functions. Therefore, in that method the averaged distribution for all pairs of each interaction type (e.g., 1-2, 1-3 interactions) defines one equilibrium distance, in contrast to Eq. 1, in which each interaction has a distinct equilibrium distance, *d*_{ij}^{0}. In principle, under the potential energy of Eq. 1, averaging distributions with different *d*_{ij}^{0} leads to broadening of the potential energy wells and thus error in *k*. Second, the force parameters derived by the Boltzmann inversion were again iteratively modified so that test MD simulations of a coarse-grained model reproduce the all-atom MD fluctuations. This iterative procedure may preclude a direct mapping of atomistic MD results onto a coarse-grained model and may include system dependence. Third, in Trylska et al. (50), only pairwise distances are considered and thus correlations through intermediate residues are not well taken into account (i.e., the correlations between residue *i* and *k* and between *j* and *k* lead to a correlation between *i* and *j*, even if *i* and *j* are not directly correlated). In contrast, the full-dimensional matrix representations of **C** and **H** in this study eliminate such indirect correlations, which may cause errors in force-constant estimation.

In Fig. 1 *b*, the mean-square fluctuations, *x*^{2}, derived from REACH using the model force constants are compared with those from crystallographic B-factors and from the MD trajectories. The MD and REACH fluctuations are similar in average magnitude to each other and to the B-factor amplitudes. Furthermore, in all three plots the regions of the protein with higher or lower fluctuations are similar. However, the correlation coefficients of the B-factor with both computational models are low (0.443 and 0.479 for the MD and REACH methods, respectively). This may be due in part to differences in the crystallographic and simulation environments (e.g., crystal packing lowers the fluctuation, especially at N- and C-termini), to the limited timescale probed in this dynamical model, and to experimental errors. The agreement between the MD and REACH *x*^{2} is much better (correlation coefficient: 0.786), showing that REACH using the MD-derived model residue force constants reproduces well the *x*^{2} derived directly from the all-atom MD trajectories.

The results using the REACH method can be compared with the results of previous simplified normal-mode methods. In the simplest version, using one parameter with a cutoff length (18–27,29,30), the force constants of pairs with *r* < *r*_{c} are assumed to be a constant, *k*_{const}. *k*_{const} is adjusted a posteriori so as to fit the average magnitude of experimental residue fluctuations. Therefore, the method in Tirion (23) can predict only ratios of residue fluctuations (which is nevertheless useful for determining, for example, dynamic domains (52) and separating static from flexible parts of protein molecule). In contrast, this method can predict the magnitudes of fluctuations using the model force constants determined from MD trajectories. Furthermore, this method has a significantly larger correlation coefficient with MD fluctuations than the method in Tirion (23) (which was 0.560 with *r*_{c} = 12 Å), due partly to the separation of the force constants into the local and nonbonded interactions (49).

Finally, the frequencies derived from the REACH calculation are compared with those from the atomistic MD. For this purpose, the vibrational spectra and the velocity autocorrelation functions were calculated and compared. The vibrational density of states along each REACH eigenvector was calculated from the MD trajectories and compared with the REACH frequency. To do this, 2^{14} MD trajectory frames, separated by 50 fs (total length of ~0.82 ns), were used for calculating the velocity autocorrelation function and its Fourier transform so as to derive the vibrational density of states, *g*(*ω*). In Fig. 2 *a* are shown the results of selected normal modes: 1, 100, 400. Modes 1 and 100 possess broad peaks. The positions of the MD vibrational peaks are at somewhat higher frequency than the corresponding REACH frequency, but, given the simplicity of the method, the agreement is satisfactory. In contrast, the high-frequency mode (mode 400) has a much broader peak at 200−300 cm^{−1} in the REACH vibrational spectrum, due to the expected poor representation of the localized, high-frequency modes by the coarse-grained model.
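The spectral step can be sketched as follows. The sketch uses the power spectrum of the velocity, which by the Wiener-Khinchin theorem is equivalent (up to normalization and windowing) to Fourier-transforming the velocity autocorrelation function; the function name `vdos` and the single-cosine test signal at 50 cm^{−1} are assumptions made for the example. The frame count and spacing follow the values quoted in the text (2^{14} frames, 50 fs).

```python
import numpy as np

C_CM_PER_S = 2.99792458e10          # speed of light in cm/s

def vdos(q, dt):
    """Velocity power spectrum of a mode coordinate q(t), on a cm^-1 axis."""
    v = np.gradient(q, dt)                          # finite-difference velocity
    v = v - v.mean()
    power = np.abs(np.fft.rfft(v)) ** 2             # ~ Fourier transform of the VACF
    wavenumber = np.fft.rfftfreq(len(v), d=dt) / C_CM_PER_S
    return wavenumber, power

# Synthetic projection onto one mode: a pure oscillation at 50 cm^-1.
dt = 50e-15                                          # 50 fs between frames
t = np.arange(2 ** 14) * dt                          # ~0.82 ns in total
q = np.cos(2.0 * np.pi * 50.0 * C_CM_PER_S * t)

wn, g = vdos(q, dt)
peak = wn[np.argmax(g)]
```

With this sampling the frequency resolution is about 0.04 cm^{−1} and the highest resolvable wavenumber is about 334 cm^{−1}.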

Figure 2. (*a*) Vibrational density of states from MD trajectory of myoglobin monomer along selected normal modes, (*a*1) mode 1; (*a*2) mode 100; and (*a*3) mode 400. The associated REACH normal-mode frequencies are also shown. In panel *b*, MD-derived frequencies are plotted **...**

For the purpose of comparing the frequencies of all the MD and REACH modes, the velocity autocorrelation function along each REACH mode was calculated from the MD trajectories and fitted to a model function, allowing the characteristic MD frequency to be derived. To do this, a model using the Langevin equation was applied to obtain the mode frequency and friction (39). Fig. 2 *b* shows the comparison between the REACH normal-mode frequency, *ω*_{REACH}, and the MD-derived frequency, *ω*_{MD}. As seen in Fig. 2 *a*, for modes <100 cm^{−1}, *ω*_{MD} is slightly higher than *ω*_{REACH} but there is a good correlation, i.e., the modes with larger *ω*_{MD} also have larger *ω*_{REACH}. In contrast, the correlation is low for *ω* >100 cm^{−1}. Thus, the simplified normal mode method reproduces well the frequencies of the larger amplitude, lower-frequency motions (<100 cm^{−1}) simulated by atomistic MD.

### Myoglobin dimer: intra- and intermolecular motions

The force constants for dimeric myoglobin were calculated from the corresponding MD trajectories and the associated model functions of the intra- and intermolecular force constants were separately derived. Using the REACH normal modes calculated from the derived model function force constants, the mean-square fluctuations were calculated and decomposed into intra- and intermolecular components.

In Fig. 3 *a*, *k* between the pairs of protein residues is plotted against *r*. The distributions for the virtual 1-2, 1-3, 1-4 bonds and intramolecular interactions are close to those found for the myoglobin monomer (see Fig. 1), although the dimer distributions are wider, due probably to the shorter MD sampling length of 5 ns. This similarity suggests that the intramolecular force constants derived from MD simulation using atomistic force fields are independent of protein association state. The distribution of intermolecular interaction force constants in Fig. 3 *a*5 appears at first glance to be random, due to the relatively low correlation of the C_{α} atom pairs at the nearest distance of *r* ~ 5.5 Å. However, the averages at *r* = 5.5−10 Å lead to statistically meaningful, nonzero values, which were used for fitting (see Fig. 3 *a*6).

Associated model functions were constructed by fitting the model functions used in the monomer calculation to the corresponding distributions. The intra- and intermolecular force constants were both modeled in single exponential form, but with different fitting parameters, i.e., *k*_{intra}(*r*) = *a*_{intra} exp(−*b*_{intra}*r*) and *k*_{inter}(*r*) = *a*_{inter} exp(−*b*_{inter}*r*). The results, together with the standard-deviation errors, are given in Table 1. The model parameters of the 1-2, 1-3, 1-4, and intramolecular interactions in the myoglobin dimer are found to be also similar to those of monomer, as are their distributions. The intermolecular force constants are slightly smaller than the corresponding intramolecular values (see Fig. 3 *a*6).

Using the derived model parameter functions the REACH normal modes of the myoglobin dimer were calculated, and from these, *x*^{2}. Fig. 3 *b*1 shows an excellent correspondence between REACH and MD, with a high correlation coefficient of 0.835. The average *x*^{2} magnitude of REACH is slightly smaller (average over residues is 1.26 Å^{2}) than that from the MD (1.46 Å^{2}), due partially again to the fact that the 1-ns MD dynamics of the fully solvated myoglobin dimer includes diffusive motions absent from the effective harmonic potential approximated by NMA.

The dimer motion can be separated into the internal motions within the two monomers (intramolecular motions) and the external motions (translation and rotation) between the two monomers (intermolecular motions). In addition to the overall motions, *x*^{2} from the intramolecular motions calculated using Eqs. 11 and 12 is also in good agreement (0.630 Å^{2} from REACH and 0.581 Å^{2} from MD). In Fig. 3 *b*2, the REACH *x*^{2} contributions from the intramolecular and intermolecular motions are shown. The presence of variation with residue number indicates that not only translation (fluctuations constant with residue number) but also whole-molecule rotation contribute to the intermolecular rigid-body motion. This correlates with the overall mean-square fluctuations, indicating, as expected, that the residues far from the molecular center (i.e., near the surface) undergo both larger internal fluctuation and larger rigid-body rotation.

### Temperature dependence of the force constants in myoglobin monomer

In this section, as an application of this REACH method, the force constants and associated model functions are calculated from the atomistic MD trajectories of the myoglobin monomer at various temperatures, and the temperature dependence of the REACH model parameters analyzed. The atomistic MD force field used in this study has no temperature dependence, i.e., the same all-atom force field was applied to the MD simulations at all temperatures. In principle, temperature-dependent atomistic force fields could be used to parameterize the REACH model, which would then be expected to produce a more accurate description of the temperature-dependent dynamics.

In Fig. 4, *a*–*d*, the average force constants for the 1-2, 1-3, 1-4, and nonbonded interactions are plotted as a function of temperature. The 1-2 and 1-3 interactions undergo transitions at *T*_{g} ~180 K and ~170 K, respectively, as given by the intersections of the fitted branches (*k*_{12}(*T*) = 847 for *T* < *T*_{g} and −0.955*T* + 1020 for *T* > *T*_{g}, and *k*_{13}(*T*) = 29.0 for *T* < *T*_{g} and −0.155*T* + 55.3 for *T* > *T*_{g}, in units of kJ/Å^{2}). Below *T*_{g}, these force constants are constant, consistent with harmonic dynamics. Above *T*_{g}, *k* decreases linearly, indicating softening of the effective harmonic potential curvature (40). In contrast, the 1-4 and nonbonded interactions decrease linearly without a transition (*k*_{14}(*T*) = −0.0541*T* + 43.7 kJ/Å^{2}).
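The fitted temperature dependences quoted above can be written down directly. In this sketch the coefficients are those given in the text, while the function names and the choice to place each transition temperature at the intersection of the two fitted branches are assumptions made for illustration.

```python
# Transition temperatures taken as the intersection of the two fitted branches.
TG_12 = (1020.0 - 847.0) / 0.955      # 1-2 interaction
TG_13 = (55.3 - 29.0) / 0.155         # 1-3 interaction

def k12(T):
    """Virtual 1-2 force constant: constant below TG_12, linear above."""
    return 847.0 if T < TG_12 else -0.955 * T + 1020.0

def k13(T):
    """Virtual 1-3 force constant: constant below TG_13, linear above."""
    return 29.0 if T < TG_13 else -0.155 * T + 55.3

def k14(T):
    """Virtual 1-4 force constant: linear at all temperatures (no transition)."""
    return -0.0541 * T + 43.7

k12_300 = k12(300.0)
```

With these coefficients the branches of the 1-2 fit meet near 181 K and those of the 1-3 fit near 170 K, and the 1-2 value at 300 K comes out at 733.5, close to the 735 used in the concluding frequency estimate.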

Figure 4. Force constants, *k*, in myoglobin monomer as a function of temperature for virtual (*a*) 1-2, (*b*) 1-3, (*c*) 1-4, and (*d*) nonbonded interactions. In case of the nonbonded interactions the force constants at *r* < 10 Å were averaged **...**

In Fig. 4 *e*, the exponent, *b*, in the nonbonded force constant model is shown as a function of temperature. 1/*b* = *l* is the correlation length over which the force constant decays to 1/*e*. *b* is found to increase linearly with *T* (*b*(*T*) = 0.0011*T* + 0.41 Å^{−1}), i.e., *l* decreases with temperature, implying increased random, short-range motion at higher temperatures. This randomness can originate from, for example, Langevin-type friction (41,42).

The mean-square fluctuations, *x*^{2}, were calculated from the REACH normal modes in combination with the temperature-dependent model force constants derived above. As *a*, the coefficient in the nonbonded function, has little temperature dependence (data not shown), the average of *a* over all the temperatures was used as a constant, i.e., *a*(*T*) = *a* = 1040 kJ/Å^{2}. Fig. 4 *f* plots *x*^{2} versus *T* and again shows good agreement between the REACH and MD results. The transition at ~170 K is thus reproduced using reduced REACH normal modes with model force constants that have very simple temperature dependences.

Finally, the question arises as to the effects on *x*^{2} of the temperature-dependent transitions in *k*_{12} and *k*_{13} seen in Fig. 4, *a* and *b*. To determine this, *x*^{2} was calculated from the REACH using the temperature-dependent *k*_{12} and *k*_{13} but imposing temperature-independent *k*_{14} and *k*_{nb}, fixed at their values at 150 K, i.e., *k*_{14}(*T*) = *k*_{14}(150 K) and *k*_{nb}(*T*) = *k*_{nb}(150 K). The results in Fig. 4 *f* show that, although the 1-2 and 1-3 interactions exhibit temperature-dependent force constant transitions, these interactions contribute little to the overall *x*^{2} transition. Rather, the transition in *x*^{2} arises from the 1-4 and nonbonded interactions, the force constants of which decrease linearly with temperature such that the associated fluctuations become increasingly dominant at *T* > 170 K.

## CONCLUDING REMARKS

In this article, a novel method, REACH, is presented for calculating effective coarse-grained dynamical modes in protein systems. The REACH elastic network model uses the variance-covariance matrix obtained from atomistic MD trajectories to derive the required residue interaction force constants. The force constants plotted as a function of pairwise distance present distinct distributions for the 1-2, 1-3, 1-4, and nonbonded interactions. Fitted analytical model functions further simplify the force field required for calculating REACH normal modes. The simplified potential function and force-field model reproduces well the mean-square fluctuations from atomic resolution MD, including both intramolecular motions of the myoglobin monomer and intermolecular motions in the dimer system. The distributions and model functions of the force constants are closely similar in monomeric and dimeric myoglobin.
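The idea of reading force constants off a covariance matrix can be illustrated with a toy example. The sketch below is not the REACH procedure itself, which works with the 3*N*-dimensional C_{α} covariance matrix from MD. It assumes only the standard harmonic relation between covariance and Hessian, *C* = *k*_{B}*T* *K*^{−1}, for two beads moving in one dimension. The tether springs, the parameter values, and all function names are hypothetical; the tethers are included only so that the 2 × 2 Hessian is invertible.

```python
K_B = 0.0083144626  # Boltzmann constant in kJ/(mol K)

def invert_2x2(m):
    """Invert a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]

def scale(m, s):
    """Multiply every matrix element by the scalar s."""
    return [[s * v for v in row] for row in m]

def covariance_from_hessian(hessian, temperature):
    """Harmonic relation C = k_B T K^{-1}."""
    return scale(invert_2x2(hessian), K_B * temperature)

def hessian_from_covariance(cov, temperature):
    """Inverse relation K = k_B T C^{-1}."""
    return scale(invert_2x2(cov), K_B * temperature)

def toy_pair_force_constant(k_pair=5.0, k_tether=1.0, temperature=300.0):
    """Build a toy Hessian, form its covariance, and recover the pair spring."""
    hessian = [[k_tether + k_pair, -k_pair],
               [-k_pair, k_tether + k_pair]]
    cov = covariance_from_hessian(hessian, temperature)
    recovered = hessian_from_covariance(cov, temperature)
    # The pair force constant is minus the off-diagonal Hessian element.
    return -recovered[0][1]
```

Calling `toy_pair_force_constant()` returns the pair spring constant that was put in, which is the sense in which a covariance matrix from a trajectory determines effective pairwise force constants.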

The internal protein dynamics can be conveniently characterized using the force constants derived. For example, a temperature dependence is seen in the average virtual 1-2 and 1-3 bond force constants. These interactions undergo a transition: at *T* < *T*_{g} they are constant, corresponding to harmonic potentials, whereas at *T* > *T*_{g} they decrease linearly, indicating softening of the effective harmonic potentials and a resulting increase in internal fluctuation. No such transitions are observed in the 1-4 and nonbonded interactions. The dynamical transition in the mean-square fluctuation, *x*^{2}, which was observed in inelastic neutron scattering experiments (34,38,53), is reproduced using the simple temperature-dependent model force constants obtained in this study. However, the local 1-2 and 1-3 interactions contribute little to the *x*^{2} transition. Rather, the linear decrease of the nonlocal interactions leads to an approximately quadratic increase in *x*^{2} by softening the elasticity in the global protein modes.
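Why a linearly decreasing force constant produces a faster-than-linear rise in the fluctuation can be seen in a single-oscillator toy model. The sketch assumes classical equipartition, *x*^{2} = *k*_{B}*T*/*k*, and a force constant of the hypothetical form *k*(*T*) = *k*_{0} − *cT*. It is not the REACH normal-mode calculation, and the parameter values are illustrative only.

```python
K_B = 0.0083144626  # Boltzmann constant in kJ/(mol K)

def msf_single_oscillator(temperature, k0, slope):
    """Equipartition estimate x^2 = k_B T / k(T) with k(T) = k0 - slope * T.

    k0 is in kJ/(mol Å^2), slope in kJ/(mol Å^2 K); both are hypothetical.
    """
    k = k0 - slope * temperature
    if k <= 0:
        raise ValueError("force constant must remain positive")
    return K_B * temperature / k

# Constant spring: the fluctuation doubles when the temperature doubles.
rigid_150 = msf_single_oscillator(150.0, 10.0, 0.0)
rigid_300 = msf_single_oscillator(300.0, 10.0, 0.0)

# Softening spring: the fluctuation more than doubles over the same range.
soft_150 = msf_single_oscillator(150.0, 10.0, 0.02)
soft_300 = msf_single_oscillator(300.0, 10.0, 0.02)
```

Expanding *k*_{B}*T*/(*k*_{0} − *cT*) for small *cT*/*k*_{0} gives a term linear in *T* plus a term in *T*^{2}, which matches the approximately quadratic behavior described above.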

Comparison of the REACH normal-mode frequencies with the corresponding MD-derived vibrational spectra shows that the normal modes at *ω* < 100 cm^{−1} have slightly lower frequencies than the MD, but that the correspondence is satisfactory given the simplicity of the model. For *ω* > 100 cm^{−1}, the REACH modes are inaccurate, as expected from the coarse-grained nature of the model. This threshold frequency (~100 cm^{−1}) is reasonable because the vibrational frequency of adjacent residue pairs, which is the highest frequency in a residue-scale model, is *ω* ≈ 140 cm^{−1} when *k*_{12} = 735 kJ/(mol Å^{2}) and the average residue mass for myoglobin, *m* = 106 a.u., are used.
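The quoted value of ~140 cm^{−1} can be checked with a short unit conversion. The sketch assumes the simple single-mass form *ω* = (*k*/*m*)^{1/2}, which the text does not state explicitly but which reproduces the quoted number; it also takes "a.u." to mean atomic mass units. The function name is hypothetical.

```python
import math

AVOGADRO = 6.02214076e23       # 1/mol
AMU_KG = 1.66053906660e-27     # kg per atomic mass unit
C_CM_PER_S = 2.99792458e10     # speed of light in cm/s

def adjacent_pair_wavenumber(k_kj_mol_A2, mass_amu):
    """Wavenumber (cm^-1) of a harmonic oscillator, assuming omega = sqrt(k/m)."""
    # Convert kJ/(mol Å^2) to N/m: J per molecule divided by (1e-10 m)^2.
    k_si = k_kj_mol_A2 * 1.0e3 / AVOGADRO / 1.0e-20
    m_si = mass_amu * AMU_KG
    omega = math.sqrt(k_si / m_si)             # angular frequency in rad/s
    return omega / (2.0 * math.pi * C_CM_PER_S)

# k_12 = 735 kJ/(mol Å^2) and m = 106 amu, as quoted in the text;
# this evaluates to roughly 140 cm^-1.
omega_12 = adjacent_pair_wavenumber(735.0, 106.0)
```

Using the reduced mass of two equal residues (*m*/2) instead would raise the result by a factor of 2^{1/2}, so the agreement with the quoted value depends on the single-mass assumption.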

The mass definition of the coarse-grained residue determines the timescale of the vibrational motion. Here, the mass was defined as that of the sum of the atoms in the residue. A slight underestimation of the vibrational frequencies resulted, which may imply overestimation of the mass, although the increased roughness of the PES in atomic simulation may also contribute somewhat to the observed frequency shift. A more refined determination of the mass and iterative estimations of the force constant model functions would improve the amplitudes and frequencies of vibrational motions calculated with REACH. Furthermore, in coarse-grained systems, the contribution of the degrees of freedom other than C_{α} atoms can be represented dynamically in the form of a (generalized) Langevin equation (54), as well as in the potential energy perturbation, i.e., the effective Hessian discussed in Eq. 10.

This method has similarities with an approach presented in Chu and Voth (32) in which the averaged values and fluctuations of the effective internal coordinates were used to define a CG model matched to all-atom MD. As with REACH, this method is self-consistently multiscale in that only information from MD is used in the derivation. This is in contrast to other methods that use experimental data (e.g., x-ray B factors) (23–27). A difference, however, is that the REACH method derives residue-interaction force constants directly from the MD covariance matrix. Thus, REACH maps the atomistic MD onto the CG model directly and analytically, without any iterative steps, whereas in Chu and Voth (32) iterations are performed to match the required fluctuations. Therefore, arguably, the REACH method is more direct. However, as REACH does not involve iterative fitting, it would not be expected to yield as close agreement with the MD fluctuations.

The REACH method requires only fitted analytical functions for the distance dependence of the various force constants used. These can be obtained from relatively short MD simulations of a small number of systems. MD studies have indicated that the atomic-detail covariance matrix does not converge in ~10 ns (55,56). However, a recent study demonstrated that the essential conformational subspace from principal component analysis converges in a few nanoseconds and reproduces well the space explored by a protein over a ~200 ns timescale (57), suggesting that the force constants calculated from nanosecond MD simulations may be useful in describing protein dynamics on the ~100 ns timescale.

Iterative procedures used in previous ENM studies are not necessary, allowing a direct mapping of atomistic MD results onto the coarse-grained model. The use of fluctuation covariance for calculating the force constants enables the equilibrium distance to be introduced into pair interactions, and avoids the indirect pair correlations and the errors in force-constant estimation that arise when the force constants are calculated using the pair distribution function.

The simple REACH concept of calculating force-constant parameters using MD trajectories enables convenient implementation in MD simulation packages; this work is now in progress. Future work will also examine possible secondary and tertiary structure dependence of the REACH force constants before application to the study of large biological systems.

## Acknowledgments

Simulations were performed on the HELICS supercomputer at the Interdisciplinary Center for Scientific Computing (IWR) at the University of Heidelberg.

We acknowledge funds from the Volkswagen Stiftung (grant No. I/80 437) and the European Union (grant No. NEST 012835 EMBIO).

## Notes

Editor: Gregory A. Voth.

## References

**The Biophysical Society**


Coarse-Grained Biomolecular Simulation with REACH: Realistic Extension Algorithm via Covariance Hessian. Biophysical Journal. Nov 15, 2007; 93(10): 3460.
