Nucleic Acids Res. 2006 Oct; 34(17): 4912–4924.
Published online 2006 Aug 18. doi:  10.1093/nar/gkl472
PMCID: PMC1635246

A set of nearest neighbor parameters for predicting the enthalpy change of RNA secondary structure formation


A complete set of nearest neighbor parameters to predict the enthalpy change of RNA secondary structure formation was derived. These parameters can be used with available free energy nearest neighbor parameters to extend the secondary structure prediction of RNA sequences to temperatures other than 37°C. The parameters were tested by predicting the secondary structures of sequences with known secondary structure that are from organisms with known optimal growth temperatures. Compared with the previous set of enthalpy nearest neighbor parameters, the sensitivity of base pair prediction improved from 65.2 to 68.9% at optimal growth temperatures ranging from 10 to 60°C. Base pair probabilities were predicted with a partition function and the positive predictive value of structure prediction is 90.4% when considering the base pairs in the lowest free energy structure with pairing probability of 0.99 or above. Moreover, a strong correlation is found between the predicted melting temperatures of RNA sequences and the optimal growth temperatures of the host organism. This indicates that organisms that live at higher temperatures have evolved RNA sequences with higher melting temperatures.


RNA is more than a simple single-stranded sequence carrying genetic information as in the Central Dogma of Biology. For example, it can form tertiary structures that, like proteins, can be catalytic. Natural and engineered RNA molecules are widely used as functional tools in enzymatic catalysis and genetic control (1–5). One current problem is how to predict the structures of functional RNA sequences.

Secondary structure, the sum of canonical base pairs, is stronger (6–9) and forms faster (10) than tertiary structure. Therefore, secondary structure can largely be determined without knowledge of tertiary structure. Comparative sequence analysis is a standard technique for determining the secondary structure of homologous RNA sequences (11–13). When only a few or even a single sequence is available, the secondary structure at 37°C can be predicted by free energy minimization algorithms (14–17) using a set of empirical free energy parameters, determined from optical melting experiments (17–21). Each parameter only depends on the sequence identity of nucleotides in the motif and in adjacent base pairs and the total free energy is the sum of nearest neighbor terms. The average sensitivity (the percentage of known base pairs that are correctly predicted) of free energy minimization prediction has been benchmarked as high as 72.8 ± 9.4% for a diverse database of sequences having fewer than 800 nt (17). Furthermore, experimentally determined constraints can improve the accuracy of prediction to as much as 84% (17,18) for sequences with <6% pseudoknotted (non-nested) base pairs (17). Partition function prediction of base pair probabilities can be used to identify base pairs in the predicted lowest free energy structure that are much more likely than average to be in the known secondary structure (22,23). For example, 91.0% of base pairs in the lowest free energy structure with pairing probability of 0.99 or higher are contained in the known structure, on average (22). The high accuracy of thermodynamic structure prediction (17) demonstrates that many RNA secondary structures can be determined from sequences, without knowledge of any tertiary contacts or protein interactions.

The current set of free energy nearest neighbor parameters for predicting the free energy of RNA secondary structure, however, is limited to application at 37°C. Many organisms, thermophiles and psychrophiles, live at temperatures far from 37°C and many experiments are conducted at other temperatures. The prediction of secondary structure of RNA at arbitrary temperature would expand our knowledge of structure and evolution in the RNA world. Moreover, it would facilitate studying and designing functional RNA molecules at temperatures other than 37°C. The enthalpy nearest neighbor parameters can be used in conjunction with available free energy nearest neighbor parameters for 37°C to determine free energy nearest neighbors at other temperatures. But the most recent enthalpy parameters were derived in 1995 using a simple model (24). At that time, no themes had emerged for the sequence-dependent stability of internal loops. Subsequently, the nearest neighbor model for free energy change at 37°C was significantly improved (17) using experimental results. Therefore, we applied the principles of the current free energy nearest neighbor model (17,18) to determine a complete set of enthalpy nearest neighbor parameters using the available optical melting data.


Database of experiments

The database of experimental data for derivation of enthalpy parameters is included in Supplementary Data. It includes 130 hairpin loops (25–31), 37 bulge loops (32,33), 337 internal loops (17,18,34–49) (99 of which are 2 × 2 internal loops), 74 multibranch loops (50,51) and 43 coaxial stacking models (52–55).

Derivation and refinement of enthalpy parameters

Canonical base pairs

The enthalpies of Watson–Crick and GU base pairs were derived by Xia et al. (21) and Mathews et al. (18), respectively.

Dangling ends and terminal mismatches

Dangling ends are unpaired nucleotides adjacent to canonical pairs and their enthalpy parameters were compiled previously (24). Dangling ends on terminal GU pairs are treated similarly to dangling ends on terminal AU pairs. Terminal mismatches are non-canonical pairs at the end of helices. The enthalpy parameters of terminal mismatches are taken from another compilation (20), with the exception of mismatches on terminal GU pairs, which were measured recently (30).

If a terminal mismatch has the potential to pair canonically, the values of A–C and C–A mismatches are used for the purine–pyrimidine mismatch and pyrimidine–purine mismatches, respectively. This is important for partition function calculations, where all possible secondary structures are considered.

Hairpin loops

The experimental enthalpies of hairpin loop formation are calculated from published experimental data (25–31) with the following equation:

$$\Delta H^\circ_{\text{loop}} = \Delta H^\circ_{\text{stem-loop}} - \Delta H^\circ_{\text{stem}},$$

where $\Delta H^\circ_{\text{stem-loop}}$ is the experimental value for unfolding the hairpin loop with stem and $\Delta H^\circ_{\text{stem}}$ is calculated with the INN-HB parameters (18,21), without an intermolecular initiation term.

The hairpin loop enthalpy parameters are estimated by linear regression using the same model as free energy nearest neighbor parameters (17), except that the GG first mismatch bonus observed for free energy does not apply for enthalpy because the bonus was not statistically significant for enthalpy. The GG stability bonus is therefore entropic in nature, consistent with the observation that GG mismatches are dynamic (56), i.e. they sample more than one single microstate on short timescales.

The enthalpies of hairpin loops are estimated by the following equation:

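$$\Delta H^\circ_{\text{loop}}(n>3) = \Delta H^\circ_{\text{initiation}}(n) + \Delta H^\circ(\text{first mismatch stacking}) + \Delta H^\circ_{\text{bonus}}(\text{UU or GA first mismatch but not AG}) + \Delta H^\circ_{\text{bonus}}(\text{special GU closure}) + \Delta H^\circ_{\text{penalty}}(\text{oligo-C loops}),$$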

where n is the number of unpaired nucleotides in the loop. Hairpins with fewer than 3 unpaired nucleotides are not allowed by the model. When n = 3, only the initiation term is considered, without any bonus or penalty terms except a penalty for hairpin loops with three Cs. When n > 3, the special GU closure bonus applies to GU-closed hairpins in which a 5′ closing G is preceded by two G residues, and $\Delta H^\circ_{\text{bonus}}$(UU or GA first mismatch but not AG) is applied to loops with first mismatches of UU or GA (G on the 5′ side and A on the 3′ side of the loop). The oligo-C penalty applies only to loops composed of all C residues and, if n > 3, is calculated with $\Delta H^\circ_{\text{penalty}}$(oligo-C loops, n > 3) = An + B. For hairpin loops composed entirely of 3 C residues, $\Delta H^\circ_{\text{penalty}}$(oligo-C loops, n = 3) is applied.

The enthalpy parameters are listed in Table 1 and the database of measured loop enthalpies is available as Supplementary Data. In the absence of data, for hairpin loops longer than 9 nt, the initiation enthalpy is approximated with the initiation term for a hairpin of 9 nt. This assumes that additional instability of hairpin loops as the loop lengthens derives from the entropy (57).
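The hairpin rules described above can be sketched in Python. This is only an illustration of the bookkeeping: every number in `PARAMS` is a hypothetical placeholder rather than a value from Table 1, and the function name, argument names and the precomputed boolean flags are assumptions made for this sketch.

```python
# All numbers are hypothetical placeholders (kcal/mol), NOT the values in Table 1.
PARAMS = {
    "initiation": {3: 1.0, 4: 2.0, 5: 3.0, 6: 4.0, 7: 5.0, 8: 6.0, 9: 7.0},
    "uu_ga_bonus": -1.0,       # UU or GA first mismatch bonus
    "gu_closure_bonus": -2.0,  # special GU closure bonus
    "c3_penalty": 3.0,         # penalty for a loop of exactly three Cs
    "oligo_c_A": 0.5,          # slope A in the oligo-C penalty A*n + B
    "oligo_c_B": 1.0,          # intercept B in the oligo-C penalty A*n + B
}


def hairpin_enthalpy(loop_seq, first_mismatch_stack,
                     uu_or_ga_first_mismatch=False,
                     special_gu_closure=False, p=PARAMS):
    """Sum the hairpin loop enthalpy terms for the unpaired loop sequence."""
    n = len(loop_seq)
    if n < 3:
        raise ValueError("hairpins with fewer than 3 unpaired nucleotides are not allowed")
    all_c = set(loop_seq) == {"C"}
    # Loops longer than 9 nt reuse the 9 nt initiation enthalpy.
    total = p["initiation"][min(n, 9)]
    if n == 3:
        # Triloops: initiation only, plus the penalty for a loop of three Cs.
        return total + (p["c3_penalty"] if all_c else 0.0)
    total += first_mismatch_stack
    if uu_or_ga_first_mismatch:
        total += p["uu_ga_bonus"]
    if special_gu_closure:
        total += p["gu_closure_bonus"]
    if all_c:
        total += p["oligo_c_A"] * n + p["oligo_c_B"]
    return total
```

Passing the first mismatch stack and the two bonus conditions as arguments keeps the sketch independent of any particular sequence-lookup scheme, which the text does not specify.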

Table 1
Hairpin loop enthalpy parameters

The measured free energies at 37°C of some special hairpin loops of 3, 4 or 6 unpaired nucleotides (30,31,34–36) are either more or less stable by 0.9 kcal/mol than the model predicts. The enthalpies for each of these sequences are listed in a separate lookup table (Table 2), to be consistent with the free energy parameters.

Table 2
Lookup table for unstable triloops and stable tetraloops and hexaloops

Bulge loops

RNA secondary structure is destabilized by bulge loops, which are an interruption of helical structure in one strand only (32,37,38). The initiation terms, ΔHbulge initiation°(n) for bulge loops of 1–3 nt, are listed in Table 3. They are the average values of experimental data (32,33), calculated using the following equation:

$$\Delta H^\circ_{\text{bulge initiation}} = \Delta H^\circ(\text{duplex with bulge}) - \Delta H^\circ(\text{duplex without bulge}) + \Delta H^\circ_{\text{bp stack}}\ (n>1),$$

where the enthalpy of the duplex without bulge is the experimental value of the sequence of the duplex without the bulge or as calculated with INN-HB parameters (21) if the experimental values were not available. ΔHbp stack° is the stacking enthalpy of the base pairs in the duplex without the bulge that flank the bulge loop in the duplex with the bulge. Because the difference of initiation enthalpies between 2 and 3 nt bulges is almost zero, it is assumed that the increasing instability for longer bulges (n ≥ 4) comes from the entropy of the loop closure (39,57). Thus, the initiation enthalpy for bulges longer than 3 nt is approximated as the 3 nt bulge enthalpy.

Table 3
Bulge loop initiation enthalpy parameters

Assuming that helical stacking is continuous between the adjacent helices for single bulges, but is interrupted by longer bulges (39,40), the enthalpies of bulge loops are calculated with the following equation:

$$\Delta H^\circ_{\text{bulge}}(n) = \Delta H^\circ_{\text{bulge initiation}}(n) + \Delta H^\circ_{\text{bp stack}}\ (\text{only applied to 1 nt loops}).$$

The calculation of enthalpies for the adjacent helices would include the terminal AU/GU penalty (21) for AU/GU pairs adjacent to the bulge loops that are longer than 1 nt. ΔHbp stack° is the canonical helix stacking enthalpy applied for the two closing base pairs as though the helix was not interrupted by the bulge loop.
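The bulge loop rules above can be sketched as follows. The initiation values are hypothetical placeholders rather than the entries of Table 3, and the function names are assumptions for illustration only.

```python
# Hypothetical initiation enthalpies (kcal/mol) for bulges of 1-3 nt.
# Placeholders for illustration, NOT the values in Table 3.
BULGE_INITIATION = {1: 10.0, 2: 7.0, 3: 7.0}


def bulge_initiation(n, table=BULGE_INITIATION):
    """Initiation enthalpy for an n-nt bulge; bulges longer than 3 nt reuse the 3 nt value."""
    if n < 1:
        raise ValueError("a bulge loop has at least one unpaired nucleotide")
    return table[min(n, 3)]


def bulge_enthalpy(n, bp_stack, table=BULGE_INITIATION):
    """Bulge enthalpy: the flanking base pair stack is added only for 1 nt bulges."""
    stack_term = bp_stack if n == 1 else 0.0
    return bulge_initiation(n, table) + stack_term
```

For bulges longer than 1 nt, the terminal AU/GU penalty on the adjacent helices would be handled by the helix calculation and is deliberately left out of this sketch.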

Internal loops

Internal loop enthalpies were calculated from experimental data (17,18,34–49) using the following equation:

$$\Delta H^\circ_{\text{internal loop}} = \Delta H^\circ(\text{entire sequence with internal loop}) - \Delta H^\circ(\text{reference sequence without internal loop}) + \Delta H^\circ_{\text{bp stack}}.$$

The range of measured enthalpies differs for internal loops of different size and symmetry; therefore, different enthalpy models are used to predict different loop types. The models are similar to those used to model free energies (17).

1 × 1 Internal loops (single mismatches)

For single non-canonical pairs (1 × 1 internal loops), the loop enthalpies are approximated by the following equation:

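$$\Delta H^\circ_{\text{loop}}(1\times1) = \Delta H^\circ_{\text{loop initiation}}(n=2) + \Delta H^\circ_{\text{AU/GU}}(\text{per AU or GU closure}) + \Delta H^\circ_{\text{GG}}(1\times1) + \Delta H^\circ_{5'\text{RU}/3'\text{YU}}(1\times1),$$

where $\Delta H^\circ_{\text{loop initiation}}(n=2)$ is the enthalpy of initiation for a single non-canonical pair; $\Delta H^\circ_{\text{AU/GU}}$ is the penalty for each AU or GU closing base pair; $\Delta H^\circ_{\text{GG}}(1\times1)$ is a bonus for a GG pair in a 1 × 1 loop; and $\Delta H^\circ_{5'\text{RU}/3'\text{YU}}(1\times1)$ is a bonus for a 5′RU/3′YU stack in a 1 × 1 loop, where R is a purine and Y is a pyrimidine.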

2 × 2 Internal loops (tandem mismatches)

The 2 × 2 internal loops, also called tandem mismatches, interrupt helical RNA with two opposing unpaired nucleotides on each strand. Many of the sequence-symmetric 2 × 2 loops have been studied experimentally (17,18,34–49) and their enthalpies are assembled in a ‘periodic table’ (Table 4). Symmetric sequences that have not been measured are approximated by averaging the most adjacent columns that have been measured. For asymmetric 2 × 2 loops, the enthalpies are approximated using the following equation:

$$\Delta H^\circ_{\text{loop}}(2\times2)(5'\text{PXYS}/3'\text{QWZT}) = \frac{\Delta H^\circ_{37}(5'\text{PXWQ}/3'\text{QWXP}) + \Delta H^\circ_{37}(5'\text{TZYS}/3'\text{SYZT})}{2} + \Delta H^\circ_{\text{GG}} + \Delta_p,$$

where $\Delta H^\circ_{\text{GG}}$ (12.5 ± 2.7 kcal/mol) is applied to loops with a GG pair adjacent to an AA or any non-canonical pair with a pyrimidine and $\Delta_p$ (2.4 ± 3.1 kcal/mol) is applied to loops with an AG or GA pair adjacent to a UC, CU or CC pair or with a UU pair adjacent to an AA pair.

Table 4
The periodic table of tandem mismatch (2 × 2 internal loop) enthalpy

Other internal loops

The enthalpies of other internal loops are approximated using the following equation:

$$\Delta H^\circ_{\text{loop}}(n) = \Delta H^\circ_{\text{loop initiation}}(n) + \Delta H^\circ_{\text{AU/GU}} + |n_1 - n_2|\,\Delta H^\circ_{\text{asym}} + \Delta H^\circ_{\text{first non-canonical pairs}}\ [\text{except for } 1\times(n-1) \text{ loops with } n>3],$$

where $\Delta H^\circ_{\text{loop initiation}}(n)$ is the enthalpy of initiation for a loop of n nucleotides; $\Delta H^\circ_{\text{asym}}$ is a penalty for loops with unequal numbers of nucleotides on each side, with $n_1$ and $n_2$ the number of nucleotides on each side; and $\Delta H^\circ_{\text{first non-canonical pairs}}$ is applied for each sequence-specific first mismatch (Table 5), but it is not applied to loops of the form 1 × (n − 1) with n > 3 (n is the total number of unpaired bases). Special first mismatch bonuses were determined for 2 × 3 and 1 × 2 internal loops with separate linear regressions.

Table 5
Approximations for internal loop enthalpy parameters at 37°C (in kcal/mol)

Moreover, the free energy parameters (Table 6) were updated for internal loops based on recent experimental measurements. The free energy parameters were obtained using the method of Mathews et al. (17). The recent data include the 3 × 3 loops from Chen et al. (41), excluding the 3 × 3 loops with a middle GA pair. The middle GA pair has been shown to enhance stability, and this extra stability cannot be predicted by the nearest neighbor parameter set used in this work (41).

Table 6
Updated internal loop free energy parameters at 37°C (in kcal/mol)

Coaxial stacking

Coaxial stacking, which is a favorable interaction of two helices stacked end to end, occurs in multibranch loops and exterior loops. Stability increments for coaxial stacking were measured with a structure composed of a short oligonucleotide bound to a single-stranded end of a stem–loop structure, creating a helical interface (52–55). The enthalpy of coaxial stacking is quantified as follows:

$$\Delta H^\circ_{\text{coaxial}} = \Delta H^\circ(\text{duplex in context of stem-loop structure}) - \Delta H^\circ(\text{duplex without stem-loop structure, predicted}) + \Delta H^\circ(\text{correction}),$$

where ΔH°(correction) is the enthalpy for displacing a 3′ dangling end on the stem–loop structure if one is present.

When the helices have no intervening mismatches, the enthalpy bonus is approximated by the nearest neighbor parameter (21) of a base pair in a helix. The excess enthalpy above the helical stacking nearest neighbor from Xia et al. (21), $\Delta H^\circ_{\text{coaxial}} - \Delta H^\circ_{\text{NN}}$, for each measured interface was calculated. With flush interfaces, i.e. with no intervening mismatch, and no strand extensions beyond the interface, the average excess enthalpy is −1.53 ± 1.45 kcal/mol. For interfaces followed by strand extensions, the excess enthalpy is 1.82 ± 1.13 kcal/mol. As the excess enthalpy changes are not statistically significant, coaxial stacking of helices with no intervening nucleotides is modeled with the enthalpy parameter in a helix.

With one intervening nucleotide from each strand, two helices can stack with an intervening mismatch between them. There are two stack increments: one is the mismatch stack at the end of one helix with continuous backbone, which is equal to the mismatch stacking parameter on a helix, and the other is the mismatch stack with discontinuous backbone, which is modeled as sequence independent. The average enthalpy of sequence independent stacks is −8.46 ± 2.75 kcal/mol. In addition, an enthalpy bonus of −0.4 or −0.2 kcal/mol is applied to intervening mismatches composed of nucleotides that could form a Watson–Crick or a GU base pair, respectively. These bonuses are identical to the free energy increments that are used and that were empirically found to improve structure prediction accuracy.

Multibranch loops

The parameters are determined by linear regression of experimental data for three- and four-way multibranch loops (50,51). In a nearest neighbor model, the bimolecular enthalpy (ΔHbimol°) for the formation of the duplex with a multibranch loop is given by the following equation:

$$\Delta H^\circ_{\text{bimol}} = \Delta H^\circ_{\text{helix1}} + \Delta H^\circ_{\text{helix2}} + \Delta H^\circ_{\text{bimol init}} + \Delta H^\circ_{\text{MBL}} - \Delta H^\circ_{\text{product mm}},$$

where helix 1 and helix 2 are the intermolecular paired helices with ΔH° predicted from nearest neighbor parameters for Watson–Crick pairs (without including bimolecular initiation so that $\Delta H^\circ_{\text{bimol init}}$ appears only once). The $\Delta H^\circ_{\text{product mm}}$ is a term that accounts for the stacking enthalpy increment of the nucleotides that can stack on the hairpin loop stems to form a modified motif after the two strands have dissociated. This is the most favorable configuration with coaxial stacking of helices (in the case of four-way multibranch loops) or of the stacking of unpaired nucleotides. $\Delta H^\circ_{\text{bimol}}$ is the experimental value, which is taken from $T_M^{-1}$ versus $\ln(C_T/4)$ plots. The multibranch loop enthalpy initiation term ($\Delta H^\circ_{\text{MBL init}}$) can be calculated from the above equation. The enthalpy of multibranch loops ($\Delta H^\circ_{\text{MBL}}$) is then modeled as the sum of two terms, initiation and stacking:

$$\Delta H^\circ_{\text{MBL}} = \Delta H^\circ_{\text{MBL initiation}} + \Delta H^\circ_{\text{MBL stacking}}.$$

The stacking term is the favorable enthalpy of coaxial stacking, terminal mismatch and/or dangling end stacking. It is determined from the stacking conformation that gives the lowest free energy, as determined by free energy nearest neighbors (50). The initiation term can be approximated by the following equation:

$$\Delta H^\circ_{\text{MBL initiation}} = a + b \times \text{asym} + c \times h + \Delta H^\circ_{\text{strain}}\ (\text{three-way loops with fewer than two unpaired nucleotides}),$$

where a, b and c are parameters determined from linear regression (Table 7) and h is the number of branching helices. ΔHstrain° is a strain enthalpy that only applies to three-way multibranch loops with fewer than two unpaired nucleotides. The asym term is the average asymmetry that reflects the distribution of unpaired nucleotides, which is defined by the following equation:

$$\text{asym} = \min\left[2.0,\ \frac{\sum_{1}^{h}\left|\text{unpaired nucleotides } 5' - \text{unpaired nucleotides } 3'\right|}{h}\right].$$

Table 7
Enthalpy parameters for multibranch loop initiation

The average asymmetry is limited to 2.0, following the rules suggested by free energy parameters. Asymmetry cannot be applied, however, by dynamic programming algorithms for secondary structure prediction (17,22). Thus, the b term was excluded for secondary structure prediction and the parameters a and c were optimized by finding the parameters that lead to the highest average sensitivity of secondary structure prediction by free energy minimization. The maximum sensitivity of prediction was found with a = 30.0 kcal/mol and c = −2.2 kcal/mol.
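A minimal sketch of the two multibranch loop quantities discussed above is given below. The values a = 30.0 and c = −2.2 kcal/mol are those quoted in the text for structure prediction, where the b term is excluded. The input representation, one `(n5, n3)` tuple of unpaired nucleotide counts per branching helix, and the function names are assumptions made for this illustration.

```python
def average_asymmetry(unpaired):
    """Average asymmetry, capped at 2.0.

    `unpaired` holds one (n5, n3) tuple per branching helix: the numbers of
    unpaired nucleotides on the 5' and 3' sides of that helix (assumed layout).
    """
    h = len(unpaired)
    if h == 0:
        raise ValueError("a multibranch loop needs at least one branching helix")
    total = sum(abs(n5 - n3) for n5, n3 in unpaired)
    return min(2.0, total / h)


def mbl_initiation_for_prediction(h, a=30.0, c=-2.2):
    """Initiation enthalpy (kcal/mol) with the b*asym term omitted,
    as done for dynamic programming structure prediction."""
    return a + c * h
```

With these defaults, each additional branching helix lowers the initiation enthalpy by 2.2 kcal/mol; the strain term for small three-way loops is not included in this sketch.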

Database of RNA secondary structures

The revised enthalpy nearest neighbor model was tested with RNA sequences with known secondary structure from organisms with known optimal growth temperature. The structures were taken from comparative analysis databases (42–49,58,59). Small (16S) subunit rRNA sequences are divided into domains as defined by Jaeger et al. (39). Large (23S) subunit rRNA sequences are divided into domains of fewer than 700 nt each (18). The optimal growth temperatures of different organisms were taken from the Prokaryotic Growth Temperature Database (http://pgtdb.csie.ncu.edu.tw/) and the DSMZ German Collection of Microorganisms and Cell Cultures website (http://www.dsmz.de/). Only the RNA sequences of mesophiles (organisms living at temperatures between 10 and 60°C, but with organisms living at 37°C excluded) were chosen to test the sensitivity and positive predictive value (PPV) of secondary structure prediction. Because posttranscriptional modification (60) and high pressure (61) in thermophiles and hyperthermophiles (organisms living above 60°C) would change the thermodynamics of secondary structure formation, sequences from these organisms were excluded. A list of the sequences and optimal growth temperatures used is available in Supplementary Data.

Accuracy of secondary structure prediction

The accuracy of structure prediction is determined by the sum of the canonical base pairs correctly predicted. A base pair is considered correctly predicted even if it is shifted by 1 nt on one side. For example a base pair between nucleotides i and j is considered to be correctly predicted if any of these base pairs is predicted: i to j, i to j − 1, i to j + 1, i − 1 to j or i + 1 to j. The predicted base pair between i − 1 and j + 1, however, is not considered to be correct. This scoring scheme reflects the uncertainty of exact base pair matches in comparative sequence analysis and the possibility for dynamics in base pairing. The values of sensitivity and PPV of this scoring scheme are ∼2–3% higher than when determined with exact base pairing only, where only the i to j base pair is considered to be correct. The prediction accuracies are shown in Supplementary Tables 11 and 12. Each table includes accuracies determined when pairs can be shifted and when pairs must be an exact match.
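The scoring scheme described above can be sketched in Python as follows. Base pairs are represented as `(i, j)` tuples of nucleotide indices; the function names are assumptions for this illustration, and the same one-nucleotide tolerance is applied when scoring both sensitivity and PPV.

```python
def is_match(pair, reference_pairs):
    """True if `pair` or a version shifted by 1 nt on one side is in `reference_pairs`.

    A shift on both sides at once, e.g. (i - 1, j + 1), is not accepted.
    """
    i, j = pair
    candidates = [(i, j), (i, j - 1), (i, j + 1), (i - 1, j), (i + 1, j)]
    return any(c in reference_pairs for c in candidates)


def sensitivity_and_ppv(known, predicted):
    """Return (sensitivity, PPV) in percent for two collections of base pairs."""
    known, predicted = set(known), set(predicted)
    found = sum(1 for p in known if is_match(p, predicted))
    correct = sum(1 for p in predicted if is_match(p, known))
    sensitivity = 100.0 * found / len(known) if known else 0.0
    ppv = 100.0 * correct / len(predicted) if predicted else 0.0
    return sensitivity, ppv


known = [(1, 20), (2, 19), (5, 15)]
predicted = [(1, 20), (2, 18), (4, 16), (8, 12)]
sens, ppv = sensitivity_and_ppv(known, predicted)
```

In this toy example the predicted pair (2, 18) counts as a match for the known pair (2, 19), whereas (4, 16) does not match (5, 15) because both indices are shifted.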

Availability of parameters

Machine-readable tables of the enthalpy parameters are available on the Mathews lab website (http://rna.urmc.rochester.edu/).


Nearest neighbor model parameters

In the nearest neighbor model of free energy (17,18), the parameters for Watson–Crick base pairs are well determined at 37°C with errors <10%, or ∼0.1–0.2 kcal/mol (21). For other motifs such as loops and GU base pairs, individual nearest neighbor free energy increments are often determined with an error <0.5 kcal/mol (17,18). In order to extend the current model to predict free energy at temperatures other than 37°C, enthalpy parameters consistent with the current nearest neighbor model are required. The free energy at arbitrary temperature for each parameter is then

$$\Delta G^\circ(T) = \Delta H^\circ - T\,\Delta S^\circ,$$

where the enthalpy (ΔH°) and entropy (ΔS°) are assumed to be temperature independent. As described in Materials and Methods, parameters for enthalpy prediction, compatible with the free energy model, were determined using available experimental data from optical melting experiments.
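The extrapolation of a single nearest neighbor parameter to another temperature can be sketched as follows, assuming temperature-independent ΔH° and ΔS°. The entropy is obtained from the enthalpy and the 37°C free energy through ΔG°37 = ΔH° − (310.15 K)ΔS°. The function names and the example numbers are hypothetical and are not taken from the parameter tables.

```python
T37_K = 310.15  # 37 degrees C in kelvins


def entropy_from_37(dH, dG37):
    """Entropy change in kcal/(mol K) implied by dH and dG at 37 C (both in kcal/mol)."""
    return (dH - dG37) / T37_K


def free_energy_at(temp_c, dH, dG37):
    """Free energy change (kcal/mol) at temp_c, with dH and dS held constant."""
    temp_k = temp_c + 273.15
    return dH - temp_k * entropy_from_37(dH, dG37)


# Hypothetical stacking term: dH = -10.0 kcal/mol, dG37 = -2.0 kcal/mol.
dg_at_60 = free_energy_at(60.0, -10.0, -2.0)  # less favorable than at 37 C
dg_at_10 = free_energy_at(10.0, -10.0, -2.0)  # more favorable than at 37 C
```

Because the error in ΔH° is multiplied by the distance from 310.15 K, the extrapolated free energy becomes less reliable the farther the target temperature is from 37°C.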

Experimental studies consistently demonstrate that enthalpy and entropy measurements have considerably larger percent error than free energy measurements. Free energy at 37°C is determined with greater precision because of correlation between errors in enthalpy and entropy (21). The larger experimental errors in enthalpy result in larger percent errors for enthalpy nearest neighbor parameters than free energy parameters. The enthalpy of RNA secondary structure is known to be a function of temperature. A linear model for heat capacity change predicts the following:

$$\Delta H^\circ(T) = \Delta H^\circ(T_0) + \Delta C_p^\circ\,(T - T_0),$$

$$\Delta S^\circ(T) = \Delta S^\circ(T_0) + \Delta C_p^\circ \ln(T/T_0),$$

where ΔCp° is a constant heat capacity change and T0 is a chosen reference temperature. It is hypothesized that the heat capacity change arises from the extent of stacking increasing with decreasing temperature. Thus, ΔCp° is negative because single strands are more organized at low rather than high temperature (62–67). The ΔCp° can be estimated by linear fits of enthalpy and entropy changes as a function of melting temperature (50,51,62) or determined by isothermal titration calorimetry at multiple temperatures (68,69). However, the effects of heat capacity change on enthalpy and entropy are antagonistic in terms of free energy change:

$$\Delta G^\circ(T) = \Delta H^\circ(T_0) - T\,\Delta S^\circ(T_0) + \Delta C_p^\circ\left[(T - T_0) - T\ln(T/T_0)\right].$$

Therefore, for certain ΔT (ΔT = T − T0), ΔCp° can be neglected because the effects are compensated in terms of free energy. To calculate the compensation for a set of RNA duplexes (62), the free energy, ΔG°, was derived directly from Equation 4 assuming that the entropy and enthalpy were independent of temperature. Then the temperature-dependent free energy, ΔGT°, was calculated with the measured non-zero ΔCp° from Equations 2–4. The free energy difference, ΔΔG° = ΔGT° − ΔG°, increases with the deviation of temperature from T0 (37°C) (Figure 1). The exact ΔΔG° for each duplex is shown in Table 8 for different temperatures. The experimental error in individual loop free energy nearest neighbor parameters at 37°C is as large as 0.5 kcal/mol (17), which corresponds to roughly a factor of 2 in equilibrium constant. Thus, the small ΔΔG° for helices suggests that the approximation of ΔCp° = 0 is reasonable for predictions from ∼10 to 60°C. Therefore, the enthalpy parameters derived here assume ΔCp° = 0 and are most accurate at predicting free energy change close to 37°C.
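The compensation can be checked numerically with a short sketch. It assumes a constant ΔCp°, so that ΔH°(T) = ΔH°(T0) + ΔCp°(T − T0) and ΔS°(T) = ΔS°(T0) + ΔCp° ln(T/T0), and compares the resulting free energy with the value obtained for ΔCp° = 0. Function names and any numbers passed in are hypothetical; units are kcal/mol for ΔH° and kcal/(mol K) for ΔS° and ΔCp°.

```python
import math

T0_K = 310.15  # reference temperature, 37 degrees C


def free_energy_constant(temp_k, dH0, dS0):
    """Free energy with temperature-independent enthalpy and entropy."""
    return dH0 - temp_k * dS0


def free_energy_with_cp(temp_k, dH0, dS0, dCp, t0=T0_K):
    """Free energy when a constant heat capacity change dCp shifts both dH and dS."""
    dH = dH0 + dCp * (temp_k - t0)
    dS = dS0 + dCp * math.log(temp_k / t0)
    return dH - temp_k * dS


def delta_delta_g(temp_c, dH0, dS0, dCp):
    """Difference between the heat-capacity-corrected and the constant-dH/dS free energy."""
    temp_k = temp_c + 273.15
    return (free_energy_with_cp(temp_k, dH0, dS0, dCp)
            - free_energy_constant(temp_k, dH0, dS0))
```

Algebraically the difference reduces to ΔCp°[(T − T0) − T ln(T/T0)]. The bracket is zero at T0 and negative elsewhere, so a negative ΔCp° gives a positive ΔΔG° that grows as the temperature moves away from 37°C in either direction.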

Figure 1
(A) Free energy difference of RNA duplex CCGGUp. ΔG° (dashed line) was derived from Equation 3, where enthalpy and entropy were averaged from the optical melting curve fits, assuming that they were independent of the temperature. Δ ...
Table 8
Free energy differences of RNA duplexes

Dynamic programming algorithm for RNA secondary structure prediction

RNAstructure is a program for RNA secondary structure prediction and analysis. It includes prediction of secondary structure by free energy minimization (17), prediction of base pair probabilities using a partition function (22), the efn2 function for predicting the free energy change of folding given a sequence and secondary structure (18), and the Dynalign algorithm for finding the secondary structure common to two sequences (70). RNAstructure was revised to make predictions at user-defined temperature. Because large internal loops are more likely at high temperature, the previous limitation on internal loop size (fewer than 30 unpaired nucleotides) (17,18,22) was removed by implementing the method of Lyngsø et al. (71). This provides an O(N³) algorithm that can predict internal loops of arbitrary size. Benchmarks for calculation time and memory requirement with and without this revision are shown in Table 9.

Table 9
Calculation time and memory size of dynamic programming for sequences of different length

Sensitivities and PPVs of structure predictions

The enthalpy nearest neighbor parameters were compared with the previous parameters and model for enthalpy and free energy assembled by Serra and Turner (24) by predicting the secondary structures of RNA sequences with known secondary structures. Sensitivities, the percent of known base pairs that are correctly predicted, using both sets of parameters are shown in Figure 2 (detailed numbers are in Supplementary Table 11A) for different types of structural RNA sequences. The known structures of these sequences were taken from comparative analysis databases (42–49,58,59). The average sensitivity is improved from 65.2 to 68.9% using the new parameters assembled here. Sensitivities are improved for most types of RNA. The exceptions are 5S rRNA and Group II introns.

Figure 2
Improvement of prediction at optimal growth temperatures. The sequences are those from mesophiles (optimal growth temperature from 10 to 60°C) without organisms with optimal growth at 37°C. The lowest free energy secondary structures were ...

To test the enthalpy parameters, the accuracy of secondary structure prediction at optimal growth temperature was compared to the accuracy of structure prediction at 37°C for organisms that do not grow optimally at 37°C for several types of RNAs (Table 10). The comparison is shown for groups divided by optimal growth temperature; the organisms in each group grow optimally in a certain range of temperatures. Compared to the prediction at 37°C, structure prediction at optimal growth temperature performs better for organisms living at temperatures between 22 and 37°C, but is worse at other optimal growth temperatures. This suggests that when enthalpy parameters are assumed to be temperature independent, their utility as a tool for deriving free energy parameters for use in predicting the lowest free energy structure is limited to a narrow temperature range. Small errors in enthalpy change parameters have a larger effect on free energy change parameter determination (Equation 1), the farther the temperature is from 37°C.

Table 10
Prediction sensitivities of the lowest free energy structure

Figure 3 shows the PPV for base pairs from the lowest free energy structure for base pairs with different pairing probabilities (see detailed numbers in Supplementary Table 12A). They are predicted using a partition function calculation at optimal growth temperature (22). PPV is the percentage of predicted base pairs that are found in the known structure. The average PPV of all pairs in the lowest free energy structures is only 62.0%, which is lower than the sensitivity (68.9%). This suggests that the model over-predicts base pairs and/or that the base pairs may not be annotated completely in the structures from comparative analysis (22). For example, if a base pair is completely conserved, then it is sometimes not annotated by comparative analysis (42–49,58,59). Base pair probabilities for all possible pairs are calculated with a partition function and grouped by different thresholds. The PPV is significantly higher for predicted base pairs in the lowest free energy structure with higher pairing probability. The average PPV reaches 90.4% for base pairs in the lowest free energy structure with pairing probability of 0.99 or above. It has been demonstrated previously that base pair probabilities predicted at 37°C can be used to find pairs with high PPV (22). The fact that this holds true at other temperatures shows that the enthalpy parameters are robust for base pair probability prediction.
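The thresholding of predicted pairs by pairing probability can be sketched as follows. The input is assumed to be a dictionary mapping each base pair `(i, j)` in the lowest free energy structure to its partition function probability; the function name and the toy data are hypothetical, and exact matching is used here for brevity instead of the one-nucleotide tolerance used in the benchmarks.

```python
def ppv_at_threshold(predicted_probs, known, threshold):
    """PPV (percent) of predicted pairs whose probability is >= threshold.

    Returns None when no predicted pair reaches the threshold.
    """
    selected = [pair for pair, prob in predicted_probs.items() if prob >= threshold]
    if not selected:
        return None
    known = set(known)
    correct = sum(1 for pair in selected if pair in known)
    return 100.0 * correct / len(selected)


# Toy data for illustration only.
predicted_probs = {(1, 20): 0.995, (2, 19): 0.99, (5, 15): 0.60, (8, 12): 0.30}
known = {(1, 20), (2, 19), (5, 15)}
ppv_all = ppv_at_threshold(predicted_probs, known, 0.0)
ppv_high = ppv_at_threshold(predicted_probs, known, 0.99)
```

Raising the threshold trades coverage for confidence: fewer pairs are reported, but a larger fraction of them is expected to be in the known structure.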

Figure 3
PPV for optimal structure and base pairs with different pairing probabilities. PPV equals the number of predicted base pairs that are in the known structure divided by the total number of predicted base pairs. Pairs in the optimal structures are grouped by ...

The fact that the accuracy of secondary structure prediction is sensitive to the accuracy of the nearest neighbor parameters, but the base pair probabilities remain a robust measure of confidence for a wide variety of temperatures is consistent with a previous work. Layton and Bundschuh (72) demonstrated that the predicted lowest free energy structure was often changed in repeated structure predictions after random adjustments of the nearest neighbor parameters within the limits of their error. Base pair probabilities, however, were less perturbed by changes in the parameters (72). With the extrapolation of nearest neighbor parameters to temperatures far from 37°C, the accuracy of the predicted lowest free energy structure is often reduced as compared to structure prediction at 37°C. The ability of the partition function predicted base pair probabilities to determine base pairs predicted with a higher confidence is unchanged with secondary structure prediction at temperatures far from 37°C. This is because the determination of base pair probabilities is not as perturbed by errors in the nearest neighbor parameters.

An example of secondary structure prediction at 37°C and at optimal growth temperature of 30°C is shown in Figure 4 for a tRNA sequence. The base pairs with higher predicted pairing probability (color annotated according to pairing probability in Figure 4B and C) are pairs predicted with greater confidence. For this sequence, secondary structure prediction is more accurate and the fidelity of structure prediction (as judged by the percent of high probability pairs) is improved at optimal growth temperature.

Figure 4
Secondary structure prediction of Saccharomyces cerevisiae tRNA (RM4000) at optimal growth temperature (30°C) (B) and at 37°C (C) with the presented nearest neighbor parameters. Base pairs in the original structure (A) are derived from ...

Correlation between melting temperature and optimal growth temperature

Melting temperature, Tm, is defined as the temperature at which half of the strands are unpaired. Assuming that an RNA melts with a two-state transition, the melting temperature (in kelvin) of a single-stranded RNA structure can be predicted by Tm = ΔH°/ΔS° (73). For example, the predicted melting temperatures (°C) for all hairpins in the database of optically melted sequences (Supplementary Data) (25–31) are plotted in Figure 5 as a function of experimentally determined Tm. This shows that the parameters adequately reflect the thermal stabilities of RNA sequences with known Tm. Better correlation was found at higher temperatures. This is expected because most of the hairpins in the experiments had high melting temperatures (25–31).
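As a worked illustration of Tm = ΔH°/ΔS° for a unimolecular two-state transition, the following sketch converts a pair of thermodynamic values to a melting temperature in °C. The function name and the numerical values are hypothetical. The units are an assumption: ΔH° in kcal/mol and ΔS° in cal/(mol·K), which is why a factor of 1000 appears.

```python
KELVIN_OFFSET = 273.15


def melting_temperature_celsius(delta_h_kcal: float, delta_s_cal_per_k: float) -> float:
    """Two-state unimolecular melting temperature in degrees Celsius.

    delta_h_kcal: enthalpy change in kcal/mol (assumed unit).
    delta_s_cal_per_k: entropy change in cal/(mol K) (assumed unit).
    """
    if delta_s_cal_per_k == 0:
        raise ValueError("entropy change must be non-zero")
    # Convert kcal to cal so that the ratio is in kelvin.
    tm_kelvin = 1000.0 * delta_h_kcal / delta_s_cal_per_k
    return tm_kelvin - KELVIN_OFFSET


# Hypothetical hairpin: dH = -30.0 kcal/mol, dS = -90.0 cal/(mol K).
tm = melting_temperature_celsius(-30.0, -90.0)
```

Because the transition is unimolecular, no strand concentration term enters the expression; a bimolecular duplex would need one.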

Figure 5
Experimental (Supplementary Data) (25–31) versus predicted (Tm = ΔH°/ΔS° − 273.15) melting temperatures of hairpin stem–loop structures. The line shows the ideal location of points, predicted Tm ...

Melting temperature reflects the thermal stability of a structure. Therefore, RNA structures in organisms living at higher temperature are expected to have higher melting temperatures. Figure 6A shows a plot of predicted melting temperatures of the lowest free energy structure versus organism optimal growth temperature (10–90°C). A strong correlation (linear correlation coefficient of 0.797) is found between the melting temperature and the optimal growth temperature for different types of RNA structures. On the other hand, there appears to be less correlation between nucleotide content and optimal growth temperature (Figure 6B–D) for diverse types of RNA, although the uracil content of 16S rRNA of thermophiles and psychrophiles was found recently to correlate inversely with their optimal growth temperatures (74). Evidently, the thermal stability of RNA structure is not simply controlled by base content. Organisms that grow at high temperature have apparently evolved RNA secondary structures with a combination of motifs that provide thermal stability.
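For readers who want to reproduce this kind of comparison on their own data, a linear (Pearson) correlation coefficient can be computed as sketched below. The function name is hypothetical, and the numbers are placeholders chosen to lie exactly on a line; they are not values from Figure 6.

```python
import math
from typing import Sequence


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Linear (Pearson) correlation coefficient between two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)


# Placeholder inputs (not measured data): growth temperatures and predicted Tm in degrees C.
growth_temps = [10.0, 30.0, 60.0, 90.0]
predicted_tms = [40.0, 60.0, 90.0, 120.0]
r = pearson_r(growth_temps, predicted_tms)
```

The same function can be applied to nucleotide content versus growth temperature to compare the strength of the two relationships.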

Figure 6
Relationships of melting temperatures, nucleotide contents and optimal growth temperatures of different types of RNA in different organisms with optimal growth temperature from 10 to 90°C: (A) Predicted melting temperature; (B) G–C pair ...


The nearest neighbor parameters for enthalpy were derived here using rules similar to those used for the free energy nearest neighbor parameters at 37°C (17). This makes these parameters useful for determining free energy parameters at an arbitrary temperature that are compatible with dynamic programming algorithms for secondary structure prediction. Some of the enthalpy parameters have large percent standard errors as compared with the free energy parameters. This reflects the larger experimental errors in enthalpy than in free energy, but it also suggests that enthalpy may be more sequence dependent than free energy. This sequence dependence cannot be determined using the currently available database of optical melting experiments and suggests a need for further optical melting experiments on model RNA systems.
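The extrapolation of a free energy parameter to another temperature can be sketched as follows, assuming temperature-independent ΔH° and ΔS°. The entropy change is obtained from the 37°C free energy and the enthalpy, and the free energy at temperature T is then ΔG°(T) = ΔH° − TΔS°. The function names and the example values are hypothetical; all energies are taken to be in kcal/mol.

```python
T37_KELVIN = 310.15
KELVIN_OFFSET = 273.15


def entropy_from_37(delta_h: float, delta_g37: float) -> float:
    """Entropy change in kcal/(mol K) from dG(37 C) = dH - 310.15 K * dS."""
    return (delta_h - delta_g37) / T37_KELVIN


def free_energy_at(delta_h: float, delta_g37: float, temp_celsius: float) -> float:
    """Free energy change (kcal/mol) at temp_celsius, assuming constant dH and dS."""
    temp_kelvin = temp_celsius + KELVIN_OFFSET
    return delta_h - temp_kelvin * entropy_from_37(delta_h, delta_g37)


# Hypothetical nearest neighbor term: dH = -10.0 kcal/mol, dG37 = -2.0 kcal/mol.
dg_37 = free_energy_at(-10.0, -2.0, 37.0)   # recovers the 37 C value
dg_60 = free_energy_at(-10.0, -2.0, 60.0)   # less favorable at higher temperature
```

Applying the same conversion to every term in a parameter set yields a temperature-specific table that a dynamic programming algorithm can use without modification.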

Another source of error comes from the assumption that the enthalpy and entropy are independent of the temperature in both the model and in the analysis of optical melting experiments. When the temperature is too far from 37°C, the sensitivity of prediction is expected to be worse than 68.9% on average because of the approximation of ΔCp°=0. For example, experiments demonstrate cold denaturation of RNA (68,69), but the nearest neighbor model does not reproduce those results. Further experiments by isothermal titration calorimetry would be needed to provide the data for a model that can include a non-zero heat capacity change.
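For reference, the standard thermodynamic relations that a model with a non-zero heat capacity change would use are written out below. They assume that ΔCp° itself does not vary with temperature, and T0 denotes a reference temperature (for example 310.15 K); neither assumption is taken from the analysis above.

```latex
\begin{align}
\Delta H^\circ(T) &= \Delta H^\circ(T_0) + \Delta C_p^\circ \,(T - T_0) \\
\Delta S^\circ(T) &= \Delta S^\circ(T_0) + \Delta C_p^\circ \,\ln\!\left(\frac{T}{T_0}\right) \\
\Delta G^\circ(T) &= \Delta H^\circ(T) - T\,\Delta S^\circ(T)
\end{align}
```

Setting ΔCp° = 0 recovers the temperature-independent ΔH° and ΔS° of the nearest neighbor model, for which ΔG°(T) is linear in T and cold denaturation cannot appear.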

There are common error sources that should be considered for the prediction of base pairs. Free energy minimization assumes that the secondary structure is at equilibrium. The nearest neighbor model is an incomplete representation of structural free energy. The parameters average some sequence-specific effects and were derived from a limited set of experiments. Some RNA sequences, in particular mRNA, may sample multiple structures at equilibrium. The parameters are derived from experimental data at 1 M NaCl, whereas the salt concentration in different organisms may be very different.

In spite of all these limitations, the nearest neighbor model predicts secondary structures with a 72.8% average sensitivity (17). Recent experimental results on the self-folding of the 16S rRNA 5′ domain (75) support the assumption of thermodynamic control of the folding pathway. Moreover, base pair prediction with the partition function can be used to determine pairs predicted with greater confidence (22).

In spite of the fact that the enthalpy parameters have larger percent errors than the free energy parameters for 37°C, the enthalpy parameters are able to predict optical melting temperatures for small model sequences. Predicted melting temperatures for structural RNA sequences correlate well with optimal growth temperature, suggesting that these parameters capture many of the sequence-dependent features of RNA folding enthalpy change.


Supplementary Data are available at NAR Online.


The authors thank Rahul Tyagi and Andrew V. Uzilov for helpful discussions. D.H.M. is an Alfred P. Sloan Research Fellow. This work was supported by National Institutes of Health Grants GM22939 to D.H.T. and GM076485 to D.H.M. Funding to pay the Open Access publication charges for this article was provided by National Institutes of Health.

Conflict of interest statement. None declared.


1. Nelson P., Kiriakidou M., Sharma A., Maniataki E., Mourelatos Z. The microRNA world: small is mighty. Trends Biochem. Sci. 2003;28:534–540. [PubMed]
2. Doudna J., Cech T. The chemical repertoire of natural ribozymes. Nature. 2002;418:222–228. [PubMed]
3. Walter P., Blobel G. Signal recognition particle contains a 7S RNA essential for protein translocation across the endoplasmic reticulum. Nature. 1982;299:691–698. [PubMed]
4. Lau N.C., Lim L.P., Weinstein E.G., Bartel D.P. An abundant class of tiny RNAs with probable regulatory roles in Caenorhabditis elegans. Science. 2001;294:858–862. [PubMed]
5. Lagos-Quintana M., Rauhut R., Lendeckel W., Tuschl T. Identification of novel genes coding for small expressed RNAs. Science. 2001;294:853–858. [PubMed]
6. Onoa B., Tinoco I., Jr. RNA folding and unfolding. Curr. Opin. Struct. Biol. 2004;14:374–379. [PubMed]
7. Crothers D.M., Cole P.E., Hilbers C.W., Schulman R.G. The molecular mechanism of thermal unfolding of Escherichia coli formylmethionine transfer RNA. J. Mol. Biol. 1974;87:63–88. [PubMed]
8. Mathews D.H., Banerjee A.R., Luan D.D., Eickbush T.H., Turner D.H. Secondary structure model of the RNA recognized by the reverse transcriptase from the R2 retrotransposable element. RNA. 1997;3:1–16. [PMC free article] [PubMed]
9. Banerjee A.R., Jaeger J.A., Turner D.H. Thermal unfolding of a group I ribozyme: the low temperature transition is primarily a disruption of tertiary structure. Biochemistry. 1993;32:153–163. [PubMed]
10. Woodson S.A. Recent insights on RNA folding mechanisms from catalytic RNA. Cell Mol. Life Sci. 2000;57:796–808. [PubMed]
11. James B.D., Olsen G.J., Pace N.R. Phylogenetic comparative analysis of RNA secondary structure. Methods Enzymol. 1989;180:227–239. [PubMed]
12. Pace N.R., Thomas B.C., Woese C.R. Probing RNA structure, function, and history by comparative analysis. In: Gesteland R.F., Cech T.R., Atkins J.F., editors. The RNA World. 2nd edn. Cold Spring Harbor: Cold Spring Harbor Laboratory Press; 1999. pp. 113–141.
13. Woese C.R., Gutell R.R., Gupta R., Noller H.F. Detailed analysis of the higher order structure of 16S-like ribosomal ribonucleic acids. Microbiol. Rev. 1983;47:621–669. [PMC free article] [PubMed]
14. Andronescu M., Aguirre-Hernandez R., Condon A., Hoos H.H. RNAsoft: A suite of RNA secondary structure prediction and design software tools. Nucleic Acids Res. 2003;31:3416–3422. [PMC free article] [PubMed]
15. Hofacker I.L. Vienna RNA secondary structure server. Nucleic Acids Res. 2003;31:3429–3431. [PMC free article] [PubMed]
16. Zuker M. Mfold web server for nucleic acid folding and hybridization prediction. Nucleic Acids Res. 2003;31:3406–3415. [PMC free article] [PubMed]
17. Mathews D.H., Disney M.D., Childs J.L., Schroeder S.J., Zuker M., Turner D.H. Incorporating chemical modification constraints into a dynamic programming algorithm for prediction of RNA secondary structure. Proc. Natl Acad. Sci. USA. 2004;101:7287–7292. [PMC free article] [PubMed]
18. Mathews D.H., Sabina J., Zuker M., Turner D.H. Expanded sequence dependence of thermodynamic parameters provides improved prediction of RNA Secondary Structure. J. Mol. Biol. 1999;288:911–940. [PubMed]
19. Turner D.H. Conformational changes. In: Bloomfield V., Crothers D., Tinoco I., editors. Nucleic Acids. Sausalito, CA: University Science Books; 2000. pp. 259–334.
20. Xia T., Mathews D.H., Turner D.H. Thermodynamics of RNA secondary structure formation. In: Soll D.G., Nishimura S., Moore P.B., editors. Prebiotic Chemistry, Molecular Fossils, Nucleosides, and RNA. NY: Elsevier; 1999. pp. 21–47.
21. Xia T., SantaLucia J., Jr, Burkard M.E., Kierzek R., Schroeder S.J., Jiao X., Cox C., Turner D.H. Thermodynamic parameters for an expanded nearest-neighbor model for formation of RNA duplexes with Watson–Crick pairs. Biochemistry. 1998;37:14719–14735. [PubMed]
22. Mathews D.H. Using an RNA secondary structure partition function to determine confidence in base pairs predicted by free energy minimization. RNA. 2004;10:1178–1190. [PMC free article] [PubMed]
23. McCaskill J.S. The equilibrium partition function and base pair probabilities for RNA secondary structure. Biopolymers. 1990;29:1105–1119. [PubMed]
24. Serra M.J., Turner D.H. Predicting thermodynamic properties of RNA. Methods Enzymol. 1995;259:242–261. [PubMed]
25. Giese M.R., Betschart K., Dale T., Riley C.K., Rowan C., Sprouse K.J., Serra M.J. Stability of RNA hairpins closed by wobble base pairs. Biochemistry. 1998;37:1094–1100. [PubMed]
26. Serra M.J., Lyttle M.H., Axenson T.J., Schadt C.A., Turner D.H. RNA hairpin loop stability depends on closing pair. Nucleic Acids Res. 1993;21:3845–3849. [PMC free article] [PubMed]
27. Serra M.J., Axenson T.J., Turner D.H. A model for the stabilities of RNA hairpins based on a study of the sequence dependence of stability for hairpins of six nucleotides. Biochemistry. 1994;33:14289–14296. [PubMed]
28. Serra M.J., Barnes T.W., Betschart K., Gutierrez M.J., Sprouse K.J., Riley C.K., Stewart L., Temel R.E. Improved parameters for the prediction of RNA hairpin stability. Biochemistry. 1997;36:4844–4851. [PubMed]
29. Groebe D.R., Uhlenbeck O.C. Characterization of RNA hairpin loop stability. Nucleic Acids Res. 1988;16:11725–11735. [PMC free article] [PubMed]
30. Dale T., Smith R., Serra M.J. A test of the model to predict unusually stable RNA hairpin loop stability. RNA. 2000;6:608–615. [PMC free article] [PubMed]
31. Antao V.P., Tinoco I., Jr. Thermodynamic parameters for loop formation in RNA and DNA hairpin tetraloops. Nucleic Acids Res. 1992;20:819–824. [PMC free article] [PubMed]
32. Longfellow C.E., Kierzek R., Turner D.H. Thermodynamic and spectroscopic study of bulge loops in oligoribonucleotides. Biochemistry. 1990;29:278–285. [PubMed]
33. Znosko B.M., Silvestri S.B., Volkman H., Boswell B., Serra M.J. Thermodynamic parameters for an expanded nearest-neighbor model for the formation of RNA duplexes with single nucleotide bulges. Biochemistry. 2002;41:10406–10417. [PubMed]
34. Shu Z., Bevilacqua P.C. Isolation and characterization of thermodynamically stable and unstable RNA hairpins from a triloop combinatorial library. Biochemistry. 1999;38:15369–15379. [PubMed]
35. Proctor D.J., Schaak J.E., Bevilacqua J.M., Falzone C.J., Bevilacqua P.C. Isolation and characterization of stable tetraloops with the motif YNMG that participates in tertiary interactions. Biochemistry. 2002;41:12062–12075. [PubMed]
36. Laing L.G., Hall K.B. A model of the iron responsive element RNA hairpin loop structure determined from NMR and thermodynamic data. Biochemistry. 1996;35:13586–13596. [PubMed]
37. Groebe D.R., Uhlenbeck O.C. Thermal stability of RNA hairpins containing a four-membered loop and a bulge nucleotide. Biochemistry. 1989;28:742–747. [PubMed]
38. Fink T.R., Crothers D.M. Free energy of imperfect nucleic acid helices, I. The bulge defect. J. Mol. Biol. 1972;66:1–12. [PubMed]
39. Jaeger J.A., Turner D.H., Zuker M. Improved predictions of secondary structures for RNA. Proc. Natl Acad. Sci. USA. 1989;86:7706–7710. [PMC free article] [PubMed]
40. Weeks K.M., Crothers D.M. Major groove accessibility of RNA. Science. 1993;261:1574–1577. [PubMed]
41. Chen G., Znosko B.M., Jiao X., Turner D.H. Factors affecting thermodynamic stabilities of RNA 3 × 3 internal loops. Biochemistry. 2004;43:12865–12876. [PubMed]
42. Gutell R.R. Collection of small subunit (16S- and 16S-like) ribosomal RNA structures. Nucleic Acids Res. 1994;22:3502–3507. [PMC free article] [PubMed]
43. Gutell R.R., Gray M.W., Schnare M.N. A compilation of large subunit (23S- and 23S-like) ribosomal RNA structures. Nucleic Acids Res. 1993;21:3055–3074. [PMC free article] [PubMed]
44. Schnare M.N., Damberger S.H., Gray M.W., Gutell R.R. Comprehensive comparison of structural characteristics in eukaryotic cytoplasmic large subunit (23S-like) ribosomal RNA. J. Mol. Biol. 1996;256:701–719. [PubMed]
45. Szymanski M., Specht T., Barciszewska M.Z., Barciszewski J., Erdmann V.A. 5S rRNA data bank. Nucleic Acids Res. 1998;26:156–159. [PMC free article] [PubMed]
46. Sprinzl M., Horn C., Brown M., Ioudovitch A., Steinberg S. Compilation of tRNA sequences and sequences of tRNA genes. Nucleic Acids Res. 1998;26:148–153. [PMC free article] [PubMed]
47. Larsen N., Samuelsson T., Zwieb C. The signal recognition particle database (SRPDB). Nucleic Acids Res. 1998;26:177–178. [PMC free article] [PubMed]
48. Brown J.W. The ribonuclease P database. Nucleic Acids Res. 1998;26:351–352. [PMC free article] [PubMed]
49. Damberger S.H., Gutell R.R. A comparative database of group I intron structures. Nucleic Acids Res. 1994;22:3508–3510. [PMC free article] [PubMed]
50. Mathews D.H., Turner D.H. Experimentally derived nearest neighbor parameters for the stability of RNA three- and four-way multibranch loops. Biochemistry. 2002;41:869–880. [PubMed]
51. Diamond J.M., Turner D.H., Mathews D.H. Thermodynamics of three-way multibranch loops in RNA. Biochemistry. 2001;40:6971–6981. [PubMed]
52. Walter A.E., Wu M., Turner D.H. The stability and structure of tandem GA mismatches in RNA depend on closing base pairs. Biochemistry. 1994;33:11349–11354. [PubMed]
53. Walter A.E., Turner D.H., Kim J., Lyttle M.H., Miller P., Mathews D.H., Zuker M. Coaxial stacking of helixes enhances binding of oligoribonucleotides and improves predictions of RNA folding. Proc. Natl Acad. Sci. USA. 1994;91:9218–9222. [PMC free article] [PubMed]
54. Kim J., Walter A.E., Turner D.H. Thermodynamics of coaxially stacked helices with GA and CC mismatches. Biochemistry. 1996;35:13753–13761. [PubMed]
55. SantaLucia J., Jr, Kierzek R., Turner D.H. Functional group substitutions as probes of hydrogen bonding between GA mismatches in RNA internal loops. J. Am. Chem. Soc. 1991;113:4313–4322.
56. Burkard M.E., Turner D.H. NMR structures of r(GCAGGCGUGC)2 and determinants of stability for single guanosine–guanosine base pairs. Biochemistry. 2000;39:11748–11762. [PubMed]
57. Jacobson H., Stockmayer W.H. Intramolecular reaction in polycondensations. I. The theory of linear systems. J. Chem. Phys. 1950;18:1600–1606.
58. Michel F., Umesono K., Ozeki H. Comparative and functional anatomy of group II catalytic introns—a review. Gene. 1989;82:5–30. [PubMed]
59. Waring R.B., Davies R.W. Assessment of a model for intron RNA secondary structure relevant to RNA self-splicing—a review. Gene. 1984;28:277–291. [PubMed]
60. Kowalak J.A., Dalluge J.J., McCloskey J.A., Stetter K.O. The role of posttranscriptional modification in stabilization of transfer RNA from hyperthermophiles. Biochemistry. 1994;33:7869–7876. [PubMed]
61. Dubins D.N., Lee A., Macgregor R.B., Jr, Chalikian T.V. On the stability of double stranded nucleic acids. J. Am. Chem. Soc. 2001;123:9254–9259. [PubMed]
62. Petersheim M., Turner D.H. Base-stacking and base-pairing contributions to helix stability: thermodynamics of double-helix formation with CCGG, CCGGp, CCGGAp, ACCGGp, CCGGUp, and ACCGGUp. Biochemistry. 1983;22:256–268. [PubMed]
63. Holbrook J.A., Capp M.W., Saecker R.M., Record M.T., Jr. Enthalpy and heat capacity changes for formation of an oligomeric DNA duplex: interpretation in terms of coupled processes of formation and association of single-stranded helices. Biochemistry. 1999;38:8409–8422. [PubMed]
64. Suurkuusk J., Alvarez J., Freire E., Biltonen R. Calorimetric determination of the heat capacity changes associated with the conformational transitions of polyriboadenylic acid and polyribouridylic acid. Biopolymers. 1977;16:2641–2652. [PubMed]
65. Pörschke D., Uhlenbeck O.C., Martin F.H. Thermodynamics and kinetics of the helix-coil transition of oligomers containing GC base pairs. Biopolymers. 1973;12:1313–1335.
66. Appleby D.W., Kallenbach N.R. Theory of oligonucleotide stabilization. I. The effect of single-strand stacking. Biopolymers. 1973;12:2093–2120. [PubMed]
67. Freier S.M., Hill K.D., Dewey T.G., Marky L.A., Breslauer K.J., Turner D.H. Solvent effects on the kinetics and thermodynamics of stacking in poly(cytidylic acid). Biochemistry. 1981;20:1419–1426. [PubMed]
68. Takach J.C., Mikulecky P.J., Feig A.L. Salt-dependent heat capacity changes for RNA duplex formation. J. Am. Chem. Soc. 2004;126:6530–6531. [PMC free article] [PubMed]
69. Mikulecky P.J., Takach J.C., Feig A.L. Entropy-driven folding of an RNA helical junction: an isothermal titration calorimetric analysis of the hammerhead ribozyme. Biochemistry. 2004;43:5870–5881. [PMC free article] [PubMed]
70. Mathews D.H. Predicting a set of minimal free energy RNA secondary structures common to two sequences. Bioinformatics. 2005;21:2246–2253. [PubMed]
71. Lyngsø R., Zuker M., Pederson C. Fast evaluation of internal loops in RNA secondary structure prediction. Bioinformatics. 1999;15:440–445. [PubMed]
72. Layton D.M., Bundschuh R. A statistical analysis of RNA folding algorithms through thermodynamic parameter perturbation. Nucleic Acids Res. 2005;33:519–524. [PMC free article] [PubMed]
73. Borer P.N., Dengler B., Tinoco I., Jr, Uhlenbeck O.C. Stability of ribonucleic acid double-stranded helices. J. Mol. Biol. 1974;86:843–853. [PubMed]
74. Khachane A.N., Timmis K.N., dos Santos V.A. Uracil content of 16S rRNA of thermophilic and psychrophilic prokaryotes correlates inversely with their optimal growth temperatures. Nucleic Acids Res. 2005;33:4016–4022. [PMC free article] [PubMed]
75. Adilakshmi T., Ramaswamy P., Woodson S.A. Protein-independent folding pathway of the 16S rRNA 5′ domain. J. Mol. Biol. 2005;351:508–519. [PubMed]
76. Wu M., McDowell J.A., Turner D.H. A periodic table of symmetric tandem mismatches in RNA. Biochemistry. 1995;34:3204–3211. [PubMed]
77. SantaLucia J., Jr, Kierzek R., Turner D.H. Stabilities of consecutive A·C, C·C, G·G, U·C, and U·U mismatches in RNA internal loops: evidence for stable hydrogen-bonded U·U and C·C+ pairs. Biochemistry. 1991;30:8242–8251. [PubMed]
78. Znosko B.M., Burkard M.E., Krugh T.R., Turner D.H. Molecular recognition in purine-rich internal loops: thermodynamic, structural, and dynamic consequences of purine for adenine substitutions in 5′ (rGGCAAGCCU)2. Biochemistry. 2002;41:14978–14987. [PubMed]
79. Schroeder S.J., Turner D.H. Thermodynamic stabilities of internal loops with GU closing pairs in RNA. Biochemistry. 2001;40:11509–11517. [PubMed]
80. Xia T., McDowell J.A., Turner D.H. Thermodynamics of nonsymmetric tandem mismatches adjacent to G·C base pairs in RNA. Biochemistry. 1997;36:12486–12487. [PubMed]

Articles from Nucleic Acids Research are provided here courtesy of Oxford University Press