# A unified view of polymer, dumbbell, and oligonucleotide DNA nearest-neighbor thermodynamics

^{*}To whom reprint requests should be addressed. e-mail: jsl@chem.wayne.edu.

## Abstract

A unified view of polymer, dumbbell, and oligonucleotide
nearest-neighbor (NN) thermodynamics is presented. DNA NN
ΔG°_{37} parameters from seven laboratories are
presented in the same format so that careful comparisons can be made.
The seven studies used data from natural polymers, synthetic polymers,
oligonucleotide dumbbells, and oligonucleotide duplexes to derive NN
parameters; used different methods of data analysis; used different
salt concentrations; and presented the NN thermodynamics in different
formats. As a result of these differences, there has been much
confusion regarding the NN thermodynamics of DNA polymers and
oligomers. Herein I show that six of the studies are actually in
remarkable agreement with one another and explanations are provided in
cases where discrepancies remain. Further, a single set of parameters,
derived from 108 oligonucleotide duplexes, adequately describes polymer
and oligomer thermodynamics. Empirical salt dependencies are also
derived for oligonucleotides and polymers.

The application of the nearest-neighbor (NN) model to nucleic acids was pioneered by Zimm (1) and by Tinoco and coworkers (2–6). Subsequently, several experimental and theoretical papers on DNA and RNA NN thermodynamics have appeared (7–22). There has been disagreement concerning a number of issues, particularly differences between DNA polymer and oligonucleotide NN thermodynamic trends and the salt dependence of nucleic acid denaturation. These differences have led to the notion that there is a “length dependency” to DNA thermodynamics (18). In this article, I show that there is a length dependence to salt effects but not for the NN propagation energies. Instead, a single set of parameters derived from 108 oligonucleotide duplexes (22) adequately describes polymer and oligonucleotide behavior.

The major sources of confusion in the literature are that the
different studies use different oligonucleotide and polymer designs,
different methods for determining thermodynamics, different methods for
analyzing data, different salt conditions, and different formats for
presenting the NN parameters. In this article, the results from seven
studies (7, 10, 12, 17, 18, 20, 21) are presented in the same format so
that direct comparisons can be made. These data are compared with the
recently compiled “unified” oligonucleotide NN parameters based
on a collection of 108 oligonucleotide duplexes from the literature
(22). This work emphasizes the Δ*G*°_{37}
parameters because Δ*G*°_{37} is more
accurate than Δ*H*° or Δ*S*° due to
compensating errors (22). Remarkably, there is consensus agreement
among the parameters determined from six laboratories (7, 10, 17, 18,
20, 21).

### Background

#### The NN Model.

The NN model for nucleic acids assumes that the
stability of a given base pair depends on the identity and orientation
of neighboring base pairs. Throughout this paper, the 10 NN dimer
duplexes are represented with a slash separating strands in
antiparallel orientation (e.g., AC/TG means 5′-AC-3′ Watson–Crick
base-paired with 3′-TG-5′). For oligonucleotide duplexes, additional
parameters for the initiation of duplex formation are introduced.
Importantly, all other sequence-independent effects are also combined
into the initiation parameter including differences between terminal
and internal NNs (23) and counterion condensation (24, 25). To account
for differences between duplexes with terminal A·T vs. terminal
G·C pairs, two initiation parameters are introduced (22, 23):
“initiation with terminal G·C” and “initiation with
terminal A·T”. An additional entropic penalty (26) for the
maintenance of the C2 symmetry of self-complementary duplexes is also
included. The total Δ*G*°_{37} is given
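by:

$$\Delta G^\circ(\text{total}) = \sum_i n_i\,\Delta G^\circ(i) + \Delta G^\circ(\text{init w/term G}\cdot\text{C}) + \Delta G^\circ(\text{init w/term A}\cdot\text{T}) + \Delta G^\circ(\text{sym}) \qquad [1]$$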

where Δ*G*°(*i*) are the standard
free-energy changes for the 10 possible Watson–Crick NNs (e.g.,
Δ*G*°(1) =
Δ*G*°_{37}(AA/TT),
Δ*G*°(2) =
Δ*G*°_{37}(TA/AT), … etc.),
*n*_{i} is the number of occurrences of each nearest
neighbor, *i*, and Δ*G*°(sym) equals +0.43
kcal/mol (1 cal = 4.184 J) if the duplex is self-complementary
and zero if it is non-self-complementary.
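
The NN sum described above can be sketched in a few lines of Python. This is an illustrative sketch rather than part of the original analysis: the function names are hypothetical, the numerical values are the unified Δ*G*°_{37} parameters as they are commonly quoted from ref. 22 (they should be checked against Table 1 before use), and the sketch assumes that one initiation term is applied for each terminal base pair.

```python
COMP = {"A": "T", "C": "G", "G": "C", "T": "A"}

# Unified NN free energies at 37 degC in 1 M NaCl (kcal/mol).
# Assumption: values as commonly quoted from ref. 22; verify against Table 1.
NN_DG37 = {
    "AA/TT": -1.00, "AT/TA": -0.88, "TA/AT": -0.58, "CA/GT": -1.45,
    "GT/CA": -1.44, "CT/GA": -1.28, "GA/CT": -1.30, "CG/GC": -2.17,
    "GC/CG": -2.24, "GG/CC": -1.84,
}
INIT_GC, INIT_AT, SYM = 0.98, 1.03, 0.43  # kcal/mol, same caveat as above


def dimer_key(x, y):
    """Name the duplex dimer for the top-strand step 5'-xy-3'."""
    key = x + y + "/" + COMP[x] + COMP[y]
    if key in NN_DG37:
        return key
    # Otherwise read the same dimer 5'->3' along the other strand.
    return COMP[y] + COMP[x] + "/" + y + x


def delta_g37(top):
    """Total free energy (kcal/mol) of a fully Watson-Crick paired duplex.

    `top` is one strand written 5'->3'; the partner is its complement.
    """
    total = sum(NN_DG37[dimer_key(a, b)] for a, b in zip(top, top[1:]))
    # Assumption: one initiation term per terminal base pair.
    for end in (top[0], top[-1]):
        total += INIT_GC if end in "GC" else INIT_AT
    # Symmetry penalty only for self-complementary duplexes.
    if top == "".join(COMP[b] for b in reversed(top)):
        total += SYM
    return total


print(round(delta_g37("CGTTGA"), 2))
```

Under these assumptions the call evaluates to about −5.35 kcal/mol for the non-self-complementary duplex CGTTGA·TCAACG; a self-complementary sequence such as GCGC additionally receives the +0.43 kcal/mol symmetry term.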

#### Application of the Unified NN Parameters.

Fig. 1 illustrates the calculation of
Δ*G*°_{37} for the sequence
CGTTGA·TCAACG using the unified NN parameters (22) in Table 1. The Δ*H*° and
Δ*S*° values are analogously calculated from the
parameters in Table 2 (22). The
Δ*G*°_{37} can also be calculated from
Δ*H*° and Δ*S*° parameters by using the
equation:

$$\Delta G^\circ_T = \Delta H^\circ - T\,\Delta S^\circ \qquad [2]$$

If large temperature extrapolation from 37°C is required, then
the difference between the heat capacities of the folded and denatured
states, Δ*C*°_{p}, should be accounted
for (27, 28). Previous data have indicated that
Δ*C*°_{p} is usually small for nucleic
acids (29, 30). Due to enthalpy–entropy compensation,
Δ*G*°_{37} is relatively insensitive to
Δ*C*°_{p}.

Fig. 1. … Δ*G*°_{37}. Each arrow points to the middle of one of the NN dimers. The duplex CGTTGA·TCAACG is non-self-complementary and thus …

#### Prediction of the Melting Temperature *T*_{M}.

*T*_{M} is
defined as the temperature at which half of the strands are in the
double-helical state and half are in the “random-coil” state. For
self-complementary oligonucleotide duplexes, the
*T*_{M} is calculated from the predicted
Δ*H*° and Δ*S*° and the total oligonucleotide
strand concentration *C*_{T}, by using the equation:

$$T_M = \frac{\Delta H^\circ}{\Delta S^\circ + R \ln C_T} \qquad [3]$$

where *R* is the gas constant [1.987
cal/(K·mol)] and *T*_{M} is in kelvin. For non-self-complementary molecules,
*C*_{T} in Eq. 3 is replaced by
*C*_{T}/4 if the strands are in equal concentration
or by (*C*_{A} − *C*_{B}/2) if
the strands are at different concentrations, where
*C*_{A} and *C*_{B} are the
concentrations of the more concentrated and less concentrated strands,
respectively. Synthetic polymers with simple repeat sequences usually
melt in a single cooperative transition (approximately two-state) that
is concentration-independent so the *T*_{M} =
Δ*H*°/Δ*S*°. Natural polymers with
heterogeneous sequences, on the other hand, usually melt with many
stable intermediate states (non-two-state) and accurate prediction
requires a statistical mechanical partition function approach (11,
31–33).
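
The two-state *T*_{M} calculation for oligonucleotides can be sketched as follows. The function names and the example Δ*H*°, Δ*S*°, and concentration values are hypothetical and chosen only for illustration; the sketch assumes two-state melting, Δ*H*° in kcal/mol, Δ*S*° in e.u., and concentrations in mol/liter.

```python
import math

R = 1.987  # gas constant, cal/(K*mol)


def effective_concentration(c_a, c_b=None):
    """Concentration term that replaces C_T in the T_M equation.

    One argument: total strand concentration of a self-complementary duplex.
    Two arguments: the two strand concentrations of a
    non-self-complementary duplex, giving C_A - C_B/2 with C_A >= C_B.
    """
    if c_b is None:
        return c_a
    hi, lo = max(c_a, c_b), min(c_a, c_b)
    return hi - lo / 2.0


def melting_temperature(dh_kcal, ds_eu, c_eff):
    """Two-state T_M in degrees Celsius."""
    t_kelvin = 1000.0 * dh_kcal / (ds_eu + R * math.log(c_eff))
    return t_kelvin - 273.15


# Hypothetical duplex: dH = -60 kcal/mol, dS = -165 e.u., 1e-4 M total strands.
tm_self = melting_temperature(-60.0, -165.0, effective_concentration(1e-4))
tm_nonself = melting_temperature(-60.0, -165.0,
                                 effective_concentration(5e-5, 5e-5))
print(round(tm_self, 1), round(tm_nonself, 1))
```

For equal strand concentrations the two-argument form reduces to *C*_{T}/4, so the same thermodynamic values give a lower *T*_{M} for a non-self-complementary duplex than for a self-complementary one at the same total strand concentration.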

### Methods

#### Formats of the NN Model.

Three formats for presenting NN
thermodynamic data are as follows: (*i*) the 10 NN base pair
dimer-stacking energies approach (4, 12, 13, 20, 21), (*ii*)
the linearly independent sequences approach (also known as the
“polymer approach”) (3, 10, 19), and (*iii*) the
“independent short sequences” (ISS) approach, which presents the
oligonucleotide NN data in an irreducible representation (23). All
three of these methods are valid and provide equivalent predictions
within round-off error (J.S. and D. M. Gray, unpublished result).
In this work, the NN parameters from the literature are cast in the
dimer-stacking format because it is easier for nonexperts to apply,
because all 10 NN parameters are required to fully describe
oligonucleotide melting, and because the ISS model does not apply to
polymers.

#### The Rank of the Stacking Matrix.

An understanding of the
effects of the rank of the stacking matrix is essential to reconcile
the literature NN parameters derived from polymers with those derived
from oligonucleotides. The stacking matrix **S** has dimensions
*M* × *N* for a data set of *M*
sequences and the columns *N* contain the number of
occurrences in each DNA sequence of the 10 NN dimers plus initiation
parameters for duplexes. The column rank of the stacking matrix
determines the maximum number of parameters that can be uniquely
determined (invariants) (19, 34). Because of constraints on the NN
composition (23), there are only eight invariants for polymers that are
usually expressed as linearly independent sequences. A convenient set
of eight linearly independent sequences was provided by Vologodskii
*et al.* (10):

In this set, *P* is a measurable property such as
Δ*G*°_{37}, Δ*H*°, or
Δ*S*°. Linear combinations of these eight invariants can
be used to completely describe the behavior of any DNA polymer within
the limits of the NN model. For oligonucleotide dumbbells with fixed
termini but different lengths, nine invariants can be determined (18,
19). To fully characterize the thermodynamics of oligonucleotide
duplexes, parameters for all 10 NN dimers (plus initiation parameters)
are required. Importantly, a set of sequences with a rank of 8 (or 9
for dumbbells) can be used to derive a set of the 10 NN dimer energies
that are a linear least-squares fit of the data set, but the solution
is not unique. To verify that the solution is not unique, one can add a
constant, *C*, to one of the dimers [other than AA/TT or
GG/CC, which are uniquely determined for both polymers and oligomers
(18)] and then add or subtract *C* or zero from the other
dimers subject to the constraints of the eight linearly independent
sequences. An alternative solution is obtained that makes exactly equal
predictions as the first solution but with different trends in the NNs.
The method of singular value decomposition (SVD) (20, 22, 34) provides
the solution with *C* equal to zero and represents the minimum
sequence dependence of the 10 dimers consistent with the eight
invariants. Other methods for obtaining a linear least-squares fit of
the data in terms of 10 parameters, particularly iterative methods
(e.g., Gauss elimination or Gauss–Jordan iteration with back
substitution) provide solutions with a nonzero and arbitrary value of
*C*; the 10 dimers from these methods have an artificially
larger sequence dependence than that determined by SVD. The solution
obtained from these other methods, however, is a linear least-squares
fit and makes predictions that are equal to those of the parameter set
obtained with SVD. These points are at the heart of reconciling the NN
data sets of Gotoh and Tagashira (7) and of Delcourt and Blake (17)
with the oligonucleotide NN parameters (see below).
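
The non-uniqueness argument can be checked numerically. The sketch below does not perform SVD; it only demonstrates that shifting the 10 dimer parameters by a constant *C* in a constrained pattern leaves every polymer prediction unchanged while altering oligonucleotide predictions. It models a polymer as an end-free (circular) repeat, which is an assumption of the sketch. The two shift patterns were derived for this illustration from the requirement that each base begins and ends the same number of dimers on an end-free strand; they are not taken from the original text, and the parameter values are placeholders.

```python
from collections import Counter

COMP = {"A": "T", "C": "G", "G": "C", "T": "A"}
DIMERS = ["AA/TT", "AT/TA", "TA/AT", "CA/GT", "GT/CA",
          "CT/GA", "GA/CT", "CG/GC", "GC/CG", "GG/CC"]

# Shift patterns (in units of C) that cancel for any end-free sequence.
# Derived for this sketch; AA/TT and GG/CC are never shifted.
SHIFT_1 = {"GT/CA": 1, "AT/TA": 1, "GC/CG": 1,
           "CA/GT": -1, "TA/AT": -1, "CG/GC": -1}
SHIFT_2 = {"CT/GA": 1, "AT/TA": 1, "CG/GC": 1,
           "GA/CT": -1, "TA/AT": -1, "GC/CG": -1}


def dimer_key(x, y):
    """Name the duplex dimer for the top-strand step 5'-xy-3'."""
    key = x + y + "/" + COMP[x] + COMP[y]
    return key if key in DIMERS else COMP[y] + COMP[x] + "/" + y + x


def nn_counts(seq, circular):
    """Count the 10 NN dimers; circular=True closes the last step."""
    steps = list(zip(seq, seq[1:]))
    if circular:
        steps.append((seq[-1], seq[0]))
    return Counter(dimer_key(x, y) for x, y in steps)


def predict(counts, params):
    """Sum of NN contributions (no initiation term)."""
    return sum(n * params[d] for d, n in counts.items())


def shifted(params, shift, c):
    """Return a new parameter set with c * shift added."""
    return {d: p + c * shift.get(d, 0) for d, p in params.items()}


# Placeholder parameters, purely illustrative (not literature values).
params = {d: -1.0 - 0.1 * i for i, d in enumerate(DIMERS)}
alt = shifted(shifted(params, SHIFT_1, 0.3), SHIFT_2, -0.2)

polymer = nn_counts("ACGTTGCAAG", circular=True)
oligo = nn_counts("ACGTTGCAAG", circular=False)
print(abs(predict(polymer, params) - predict(polymer, alt)) < 1e-9)  # True
print(abs(predict(oligo, params) - predict(oligo, alt)) < 1e-9)      # False
```

The two parameter sets show different trends among the dimers yet agree for the end-free sequence, which is why polymer data alone cannot fix all 10 dimer energies, whereas the linear sequence distinguishes them.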

#### Converting Polymer Stability Temperatures to Energies.

Several
studies (7, 10, 17, 18) present the NN stabilities in terms of Kelvin
temperatures, *T*(*i*), where *i* are the NN
dimers, instead of free energy changes,
Δ*G*°(*i*). The dimer stacking
Δ*H*°(*i*) can be calculated from
*T*(*i*) with Eq. 4:

$$\Delta H^\circ(i) = T(i)\,\Delta S^\circ(i) \qquad [4]$$

The polymer studies assume that the dimer propagation
Δ*S*°(*i*) is −24.85 ± 1.74
cal/(K·mol) for all stacking dimers and is independent of the salt
concentration (17, 18). The dimer
Δ*G*°_{37} values are then calculated
with Eq. 2. For example, table 2 of Delcourt and Blake (17)
lists *T*(TA/AT) = 56.31°C, which gives
Δ*H*°(TA/AT) = (56.31 + 273.15) × −24.85 e.u. =
−8,187 cal/mol (where e.u. is entropy unit). Using Eq. 2
gives Δ*G*°_{37} = −8,187 cal/mol
− 310.15 × −24.85 e.u. = −0.48 kcal/mol. This conversion was
performed for all of the NN parameters given in Delcourt and Blake
(17), Vologodskii *et al.* (10), and Gotoh and Tagashira (7).
To remove the *C* contribution, the literature NN parameters
(7, 10, 17) were used to calculate the eight invariants given above
(*P*_{1} through *P*_{8}) and then
SVD was used to produce a new set of NN parameters with *C*
equal to zero (Table 1).
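
The temperature-to-energy conversion used for the polymer data can be written as a short function. The sketch below reproduces only the worked TA/AT example given in the text; the function name is hypothetical, and the sequence-independent Δ*S*° of −24.85 e.u. is the polymer assumption stated above.

```python
DS_POLYMER = -24.85  # e.u., sequence-independent polymer assumption (ref. 17)
T_37 = 310.15        # 37 degC in kelvin


def stability_temp_to_energies(t_celsius, ds_eu=DS_POLYMER):
    """Convert a dimer stability temperature into (dH, dG37) in kcal/mol."""
    t_kelvin = t_celsius + 273.15
    dh_cal = t_kelvin * ds_eu        # dH(i) = T(i) * dS(i)
    dg_cal = dh_cal - T_37 * ds_eu   # dG37 = dH - T * dS at 310.15 K
    return dh_cal / 1000.0, dg_cal / 1000.0


dh, dg = stability_temp_to_energies(56.31)  # T(TA/AT) from ref. 17
print(round(dh, 3), round(dg, 2))           # -8.187 -0.48
```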

The studies of Vologodskii *et al.* (10) and Doktycz
*et al.* (18) separated the dimer temperature stabilities into
“hydrogen bonding” contributions
*T*_{HB}(*i*) that are dependent on percent
G+C content and “dimer-stacking” contributions
*T*_{ST}(*i*) that are perturbations that
contain the NN sequence dependence (these terms also contain the other
fundamental interactions such as electrostatics and conformational
entropy). The experimental work of Frank-Kamenetskii (35) indicated
that hydrogen bonding contributions are salt-dependent and calculated
as follows:

where *T*_{HB}(A·T) and
*T*_{HB}(G·C) are the hypothetical melting
temperatures of isolated A·T and G·C pairs without any
stacking interactions and with the initiation parameter set to zero
(polymer behavior). Thus, *T*_{HB}(AA/TT),
*T*_{HB}(AT/TA), and
*T*_{HB}(TA/AT) are given by Eq. 5a;
*T*_{HB}(GG/CC), *T*_{HB}(GC/CG),
and *T*_{HB}(CG/GC) are given by Eq. 5b; and
*T*_{HB}(CA/GT),
*T*_{HB}(AC/TG),
*T*_{HB}(GA/CT), and
*T*_{HB}(AG/TC) are given by the average of Eqs.
5a and 5b. Eqs. 5a and 5b
were used to analyze the data of Vologodskii *et al.* (10).
Doktycz *et al.* (18) performed their dumbbell study in 0.115
M Na^{+} and found experimentally that
*T*_{HB}(A·T) and
*T*_{HB}(G·C) were 339.67 K and 383.67 K,
respectively. Table 3 of Vologodskii
*et al.* (10) lists *T*_{ST}(*i*),
and other works (17, 18) list the stacking perturbations as
δΔ*G*°(*i*). Eq. 6 allows
calculation of stability temperatures *T*(*i*) from
*T*_{HB}(*i*) and
*T*_{ST}(*i*) or
δΔ*G*°(*i*).

For example, the study of Doktycz *et al.* (18) reported
δΔ*G*°(AA/TT) as −196 cal/mol. Using Eqs. 6,
4, and 2 for the AA/TT stack gives:

and

All other parameters in Table 1 in the columns for Gotoh and
Tagashira (7), Vologodskii *et al.* (10), Delcourt and Blake
(17), and Doktycz *et al.* (18) were calculated similarly.

### Results and Discussion

Table 1 presents the NN
Δ*G*°_{37} values for helix propagation
and initiation from eight experimental studies. With all these data
sets presented in a uniform format, a remarkable consensus is
immediately evident. The qualitative trend observed in order of
decreasing stability is GC/CG = CG/GC > GG/CC >
CA/GT = GT/CA = GA/CT = CT/GA >
AA/TT > AT/TA > TA/AT. To quantify the quality of the
NN parameters, linear regression analysis was performed with the
literature NN as the dependent variable (*y* axis) and the
unified oligonucleotide NN parameters as the independent variable
(*x* axis) (Table 3). The slope of this plot indicates how
closely the range (i.e., the difference between the largest and the
smallest NN parameters) of the literature NN agrees with the range
observed in the unified NN parameters. The intercept indicates the
quality of the initiation parameter and salt dependence and also
contains a contribution from the slope. The correlation coefficient
*R*^{2} indicates the quality of the trend in NN
parameters. The polymer studies (7, 10, 17) show a remarkable
correlation with the unified oligonucleotide parameters
(*R*^{2} = 0.97, 0.98, 0.95, respectively). The
slopes are close to one for each of these studies, indicating that the
ranges are in good agreement with the unified parameters. The
intercepts show a systematic sodium concentration dependence (see
below). The oligonucleotide-duplex-derived NN parameters of SantaLucia
(20) and Sugimoto (21) are in excellent agreement with the unified
parameters, which is not surprising because the data from these studies
make up the majority of the unified data set. The poor agreement of the
Breslauer NN parameters (12) (Tables 1 and 3) is discussed below. The
oligonucleotide dumbbell parameters of Benight (18) also show good
agreement with the unified parameters.

#### Why Do Rank-Deficient Polymer NN Parameters Agree with Rank-Determinant Oligonucleotide NN Parameters?

It is surprising
that polymer parameters with a rank of 8 are observed to agree so well
with the 10 oligonucleotide dimers. The rationale for this observation
is that most of the sequence dependence of oligonucleotide DNA
thermodynamics is captured in the first eight terms and the remaining
two terms are small perturbations that are difficult to detect within
the error limits of most measurement techniques. The slopes given in
Table 3 reveal that the polymer NN have a slightly smaller range in NN
Δ*G*°_{37} that is primarily due to the
rank deficiency of the polymer parameters. For example, the range in
the unified Δ*G*°_{37} parameters is 1.66
kcal/mol (TA/AT − GC/CG), whereas for
Δ*G*°_{37} parameters from Vologodskii
*et al.* (10), the range is only 1.32 kcal/mol (TA/AT
− CG/GC). This extra sequence dependence for the oligomer NN has
almost no effect on the eight invariants needed to predict polymer
thermodynamics. A general result from this is that the oligomer NN can
predict polymer behavior accurately, but the polymer NN data cannot be
used to reliably predict oligomer behavior.

#### Salt Dependence of Oligonucleotides.

Recently, we have
reanalyzed the literature thermodynamic data for 26 oligonucleotide
duplexes dissolved in 0.01 M to 0.3 M NaCl (see ref. 20 and references
therein). The salt correction for DNA is assumed to be independent of
sequence but to be dependent on oligonucleotide length (36). The
difference between the thermodynamics of 26 literature duplexes
dissolved in different sodium concentrations and the NN predictions in
1 M NaCl was plotted vs. *N* × ln[Na^{+}]
with the intercept forced through zero. A linear least-squares fit of
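this plot gives:

$$\Delta G^\circ_{37}(\text{oligomer}, [\mathrm{Na}^+]) = \Delta G^\circ_{37}(\text{unified oligomer}, 1\ \text{M NaCl}) - 0.114 \times N \times \ln[\mathrm{Na}^+] \qquad [7]$$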

where Δ*G*°_{37}(oligomer,
[Na^{+}]) is the Δ*G*°_{37}
for an oligonucleotide duplex dissolved in a given sodium
concentration, Δ*G*°_{37}(unified
oligomer, 1 M NaCl) is the Δ*G*°_{37}
predicted from the unified NN parameters at 1 M NaCl, and *N*
is the total number of phosphates in the duplex divided by 2 (e.g., for
an 8-bp duplex without terminal phosphates, *N* = 7). The
length dependence in Eq. 7 neglects differential cation
binding in the middle vs. the ends of a duplex (36). The standard
deviation in the slope (−0.114 kcal/mol) is 0.033 kcal/mol. Eq.
7 predicts the Δ*G*°_{37} of
26 oligonucleotide duplexes with fewer than 17 bp with a standard
deviation of 0.60 kcal/mol. Eq. 7 gives
∂Δ*G*°_{37}/∂ln[Na^{+}] = −0.114 kcal/mol,
which corresponds to a
∂*T*_{M}/∂log[Na^{+}] of 11.7°C
[assuming a sequence-independent Δ*S*° of −22.4 e.u. per
base pair (see below)]; this agrees with previously observed values for
oligonucleotides (29, 30). The entropy correction is given by:

$$\Delta S^\circ(\text{oligomer}, [\mathrm{Na}^+]) = \Delta S^\circ(\text{unified oligomer}, 1\ \text{M NaCl}) + 0.368 \times N \times \ln[\mathrm{Na}^+] \qquad [8]$$

If the Δ*H*° is assumed to be
salt-concentration-independent (30, 36), the *T*_{M}
values of the 26 oligonucleotides are predicted by using Eqs.
8 and 3 with an average deviation of 2.2°C. The
salt corrections given in Eqs. 7 and 8 can be
viewed as either length-dependent corrections to the initiation
parameter or as corrections to the propagation parameters because there
are *N* NNs in an oligonucleotide duplex.
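
A minimal sketch of the oligonucleotide salt correction follows. The function name and the example input values are hypothetical. The free-energy slope of −0.114 kcal/mol per *N* × ln[Na^{+}] is taken from the text; the entropy coefficient of 0.368 e.u. is the value that follows from that slope at 310.15 K when Δ*H*° is assumed to be salt-independent (0.114 × 1000/310.15), and the sketch assumes a duplex without terminal phosphates so that *N* is the number of base pairs minus one.

```python
import math


def oligo_salt_correction(dg37_1m, ds_1m, n_bp, na_molar):
    """Correct 1 M NaCl predictions to a lower sodium concentration.

    dg37_1m: predicted free energy at 37 degC in 1 M NaCl (kcal/mol)
    ds_1m:   predicted entropy in 1 M NaCl (e.u.)
    n_bp:    duplex length in base pairs (no terminal phosphates assumed)
    """
    n = n_bp - 1                          # phosphates / 2, e.g. 7 for 8 bp
    ln_na = math.log(na_molar)
    dg37 = dg37_1m - 0.114 * n * ln_na    # free-energy correction
    ds = ds_1m + 0.368 * n * ln_na        # entropy correction, dH unchanged
    return dg37, ds


# Hypothetical 8-bp duplex with dG37 = -10.0 kcal/mol and dS = -200 e.u.
dg, ds = oligo_salt_correction(-10.0, -200.0, 8, 0.1)
print(round(dg, 2), round(ds, 1))
```

Lowering the sodium concentration makes the predicted Δ*G*°_{37} less favorable and the predicted Δ*S*° more negative, while at 1 M NaCl both corrections vanish.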

#### Salt Dependence of Polymers.

Helix formation in polymers
does not formally involve an initiation parameter so the salt
dependence is by default incorporated into the NN propagation terms
(35). The observation that polymer NN parameters and the unified
oligonucleotide NN are highly correlated (Table 3) suggests a
relationship could be determined that would allow prediction of polymer
behavior from the unified NN parameters with an appropriate salt
correction. The Δ*G*°_{37} differences between
the three polymer NN data sets (Table 1) and the unified NN (1 M NaCl)
data set were plotted vs. the ln[Na^{+}] of the polymer
data. From a least squares fit of this plot (30 data points) and the
assumption that the salt correction is sequence independent, the
following empirical equation was derived:

$$\Delta G^\circ_{37}(\text{polymer NN}, [\mathrm{Na}^+]) = \Delta G^\circ_{37}(\text{unified oligomer NN}, 1\ \text{M NaCl}) - 0.175 \times \ln[\mathrm{Na}^+] - 0.20 \qquad [9]$$

The standard deviations in the slope (−0.175 kcal/mol)
and intercept (−0.20 kcal/mol) are 0.034 kcal/mol and 0.11
kcal/mol, respectively. Note that this correction is given in
kcal/mol of base pairs. Alternatively, the
Δ*G*°_{37} of each polymer NN at three
salt concentrations can be individually plotted vs.
ln[Na^{+}] to test the sequence dependence of salt effects.
Unfortunately, the
Δ*G*°_{37}(*i*) vs.
ln[Na^{+}] plots for the CT/GA, CG/GC, and TA/AT
neighbors show correlation coefficients *R*^{2} that
are less than 0.9. Nonetheless, for the seven NN that show a linear
salt dependence of Δ*G*°_{37}
(*R*^{2} > 0.95), it does appear that A+T-rich NNs
show a larger salt dependence than the G+C-rich NNs, consistent with
earlier observations (see Eq. 5a) (35). Blake (37) has
provided a tentative salt dependence of the 10 dimers.

Eq. 9 gives
∂Δ*G*°_{37}/∂ln[Na^{+}] = −0.175 kcal/mol,
which corresponds to a
∂*T*_{M}/∂log[Na^{+}] of 16.2°C
(when a sequence-independent Δ*S*° of −24.85 e.u. is
assumed), which agrees well with the widely used value for polymers of
16.6°C (38). Fig. 2 plots the
NN stabilities observed in the three polymer studies (Table 1) (7, 10,
17) versus those predicted with Eq. 9. The slope and
intercept are close to 1 and 0, respectively, and the correlation
coefficient *R*^{2} is 0.96 (see Fig. 2). The
standard deviation between the predictions with Eq. 9 and
the experimental data is 0.12 kcal/mol (the eight polymer invariants
are predicted within 0.09 kcal/mol). For comparison, when the polymer
parameters of the three groups (7, 10, 17) are compared with each
other, the least-squares fit produces standard deviations of 0.05, 0.9,
and 0.9 kcal/mol for the comparisons of data from ref. 10 vs. ref.
17, ref. 10 vs. ref. 7, and ref. 17 vs. ref. 7, respectively. The
experimental error reported for the unified parameters is ~0.05
kcal/mol (22). Thus, the unified NN used with Eq. 9
provides predictions of polymers within experimental error of those
obtained with the polymer parameters. A test of the validity of Eq.
9 is to use it to actually predict the stability of
polymers. A plot of the experimental
Δ*G*°_{37} for 27 different synthetic
polymers dissolved in solutions ranging from 0.01 M to 0.20 M
Na^{+} (16, 39) vs. those predicted with Eq. 9
gives a linear least-squares regression line of *y* =
1.043*x* − 0.040 with *R*^{2} = 0.894 (data not
shown). The standard error between experiment and prediction is 0.14
kcal/mol. This level of agreement suggests the unified NN parameters
accurately reflect the polymer NN trends.
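
The polymer salt correction and the conversion of its slope into a *T*_{M} dependence can be sketched as follows. The function names are hypothetical. The slope (−0.175 kcal/mol) and intercept (−0.20 kcal/mol) are those quoted above, and the slope conversion assumes, as in the text, a sequence-independent propagation entropy with the salt dependence carried entirely by Δ*H*°.

```python
import math


def polymer_nn_dg37(unified_dg37_1m, na_molar):
    """Polymer NN free energy (kcal/mol) from a unified 1 M NaCl value."""
    return unified_dg37_1m - 0.175 * math.log(na_molar) - 0.20


def tm_slope_per_decade(dg_slope_kcal, ds_eu):
    """Convert d(dG37)/d ln[Na+] into d(T_M)/d log10[Na+] (degC).

    Assumes T_M = dH/dS per base pair with a salt-independent dS.
    """
    return dg_slope_kcal * 1000.0 * math.log(10) / ds_eu


print(round(tm_slope_per_decade(-0.175, -24.85), 1))  # polymers, 16.2
print(round(tm_slope_per_decade(-0.114, -22.4), 1))   # oligomers, 11.7
```

The same conversion reproduces both quoted slopes, 16.2°C for polymers and 11.7°C for oligonucleotides, and at 1 M Na^{+} the logarithmic term vanishes so that only the −0.20 kcal/mol offset remains.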

It is interesting to compare the salt dependence for oligonucleotide
duplexes and for polymers given by Eqs. 7 and 9.
The slope of the ln[Na^{+}] term is 54% larger for polymers
than for oligonucleotides (0.175 vs. 0.114, respectively). When
extrapolated to 1 M Na^{+}, the polymer NN are more stable
than oligomer NN by −0.20 kcal/mol. Qualitatively, these differences
in salt dependence can be viewed simply as arising either from “end
effects” present in oligonucleotides but not in polymers (36, 40) or
from “polymer counterion condensation effects” (24, 25) that are
reduced in oligonucleotides. On the basis of these data, there does
appear to be a “length dependency” to the salt behavior of
nucleic acids that is not yet completely understood (18, 36).

#### Sequence Dependence of the Propagation Entropy.

The
polymer and dumbbell studies assume that the propagation entropy
change, Δ*S*°(*i*) is independent of sequence and
salt concentration (7, 10, 17, 18). The most reliable estimate for
polymers is −24.85 ± 1.74 e.u. in 0.075 M NaCl (17). The
sequence-independent Δ*S*°(*i*) assumption
introduces error into Δ*H*° and
Δ*G*°_{37} via Eqs. 4 and
2 of approximately 0.6 kcal/mol and 0.06 kcal/mol,
respectively. The unified oligonucleotide
Δ*S*°(*i*) in 1 M NaCl range from −19.9 to
−27.2 e.u. (22) with an average of −22.4 e.u. and standard deviation
of 2.1 e.u. The use of the polymer
Δ*S*°(*i*) of −24.85 e.u. is not appropriate for
predictions of oligonucleotides (>20% error in Δ*S*°
predictions). However, the idea that the
Δ*S*°(*i*) is sequence independent is nearly
correct for DNA. A sequence independent
Δ*S*°(*i*) of −22.4 e.u. predicts the
Δ*S*° values of the unified oligonucleotide data set with
an average deviation of 9.4%. This is close to the predictive capacity
of the unified NN parameters themselves, which predict the unified data
set with an average deviation of 8.4%.

#### Analysis of Gotoh and Tagashira (7).

Gotoh and Tagashira
(7) measured the UV thermal denaturation curves of 11 DNA restriction
fragments dissolved in 0.0195 M Na^{+}. The curves were fit
with the Poland partition function algorithm (32) using the
Fixman–Freire approximation for the loop functions (33) and modified
to incorporate heterogeneous stacking (7). Vologodskii *et
al.* (10) critically evaluated the work of Gotoh and Tagashira (7)
and concluded that the low salt concentration was responsible for the
observed hysteresis and suggested that this indicated nonequilibrium
conditions. This work, however, shows that with the proper salt
extrapolation, the parameters of Gotoh and Tagashira (7) are in
remarkable agreement with other polymer parameters (10, 17) and with
the unified oligonucleotide NN parameters (Fig. 2). This suggests any
nonequilibrium effects in Gotoh and Tagashira’s study must have been
relatively small.

#### Analysis of Vologodskii *et al.* (10).

Vologodskii *et al.* (10) derived NN parameters by using the
linearly independent sequences approach from eight natural DNA polymer
restriction fragments dissolved in 0.195 M Na^{+}. The high
salt concentration used ensured equilibrium conditions throughout the
melting curve. Vologodskii’s study used a partition function approach
similar to that in the study of Gotoh and Tagashira (7). Vologodskii
*et al.* (10) presented their NN parameters as both eight
linearly independent sequences and 10 nonunique dimer parameters by
using the assumptions that AT/TA = TA/AT and GC/CG =
CG/GC. Oligonucleotide experiments reveal that these assumptions are
approximately correct for DNA but not for RNA (13). The Vologodskii
Δ*G*°_{37} NN parameters show remarkable
agreement with the unified NN parameters (Tables 1 and 3). The results
presented herein verify that the experimental design and analysis
methods used in ref. 10 are fundamentally sound.

#### Analysis of Breslauer *et al.* (12).

Breslauer
*et al.* (12) derived NN thermodynamic parameters by using
differential scanning calorimetry and UV melting analysis of 19
oligonucleotide duplexes (dissolved in 1 M NaCl) and nine synthetic DNA
polymers (dissolved in low salt with results extrapolated to 1 M
Na^{+}). It is not possible to rederive the reported
parameters, however, because much of the primary thermodynamic data
have not been published. This work demonstrated good insight in that
the authors reasoned that polymer and oligomer NN trends should be
similar. However, the assumption that the initiation
Δ*G*°_{37} is 5.2 kcal/mol is most
likely what led to the incorrect NN determined (Tables 1 and 3).
Breslauer’s NN predict the Δ*G*°_{37},
Δ*H*°, Δ*S*°, and *T*_{M}
of the unified data set with average deviations of 16.7%, 10.1%,
10.6%, and 6.0°C, respectively. Predictions are particularly poor
for oligonucleotides shorter than 8 bp. For example, the
*T*_{M} of the sequence CACAG·CTGTG (41) is
incorrectly predicted by 31°C. Other groups have also been unable to
reconcile the Breslauer parameters with experiments (18, 20, 21, 37,
42).

#### Analysis of Delcourt and Blake (17).

Delcourt and Blake (17)
studied 41 restriction fragments of natural polymers dissolved in 0.075
M Na^{+} and expressed their NN parameters in terms of 10
nonunique dimers that make good predictions of polymers but do not
represent the real trends in NN stability. The results presented herein
verify that the experimental design and analysis methods used in
Delcourt and Blake (17) are fundamentally sound.

#### Analysis of Doktycz *et al.* (18).

Benight and coworkers (18) recognized that there was consensus agreement among the polymer studies and their own dumbbell studies but could not reconcile the oligomer literature parameters (12). On the basis of this problem, the authors proposed that there must be a “length dependency” to DNA NN thermodynamics. Herein I show that the NN parameters themselves are not “length-dependent” but that the salt dependence is length-dependent in ways that are still not fully understood.

Benight and coworkers (18) used UV melting analysis of 17
oligonucleotide dumbbells with 14–18 bp to determine nine linearly
independent sequences that follow a NN model. The experimental design
for this study precluded measurement of an initiation parameter for
duplex formation and a 10th NN parameter. The analysis was performed
under four salt conditions, including 25, 55, 85, and 115 mM
Na^{+}. These data suggested that the NN model breaks down at
salt concentrations of less than 85 mM but works well at 115 mM
Na^{+}. It is possible that the neglect of the length
dependence of salt effects (Eq. 7) is what led to the
apparent breakdown of the NN model at low salt concentrations. Doktycz
*et al.* (18) assumed that the Δ*S*° for helix
propagation was sequence- and salt-independent (−24.8 e.u. per bp)
(17), which is incorrect for oligonucleotides. With the exception of
the CG/GC neighbor, the dumbbell
Δ*G*°_{37} NN parameters in 0.115 M
Na^{+} show good agreement with the unified NN parameters
(Table 1).

#### Analysis of SantaLucia *et al.* (20).

SantaLucia
*et al.* (20) derived NN parameters from thermodynamics
determined by a van’t Hoff analysis of UV melting data for 23
oligonucleotides combined with calorimetric or UV melting results from
the literature for 21 other sequences. To minimize “fraying
artifacts”, all sequences included in the linear regression analysis
to determine NN parameters had terminal G·C pairs (12). The
SantaLucia parameters are within experimental error of the unified
parameters in Table 1.

Data were available in the literature for eight sequences with terminal
T·A pairs. These data were included in the first fit of the NN
parameters and the stacking matrix was not rank deficient, but
nonphysical results were obtained for six of the nearest neighbors
(AT/TA, TA/AT, CA/GT, AC/TG, GA/CT, and AG/TC) (20). For
example, the Δ*H*° parameters for AT/TA and TA/AT
neighbors were found to be −10.80 and +1.16 kcal/mol, respectively,
which is unlikely (J.S., unpublished results). We now know that four of
the sequences with terminal T·A pairs exhibited non-two-state
behavior (20, 21). Upon removal of the sequences with terminal T·A
pairs, more reasonable results were obtained, but the rank of the
stacking matrix was reduced to 10 (nine linearly independent sequences
plus one initiation parameter) (23).

#### Analysis of Sugimoto *et al.* (21).

Sugimoto
*et al.* (21) derived NN parameters from thermodynamics
determined by a van’t Hoff analysis of UV melting data for 50
oligonucleotides combined with data for 15 sequences from other
laboratories obtained by both calorimetry and UV melting. Except for
the initiation Δ*G*°_{37}, the CG/GC
Δ*G*°_{37}, and the GG/CC
Δ*H*°, the Sugimoto parameters (21) are in good agreement
with the unified NN parameters (Tables 1 and 3). With the proper linear
regression analysis, Sugimoto’s data set provides NN and initiation
parameters that are in excellent agreement with the unified parameters
(22). Important results of this work are that separate parameters for
terminal T·A base pairs and for initiation at A·T are not
required.

### Conclusion

A unified set of NN parameters is now available for making accurate predictions of DNA oligonucleotide, dumbbell, and polymer thermodynamics. The agreement among the various polymer and oligomer studies provides a great deal of confidence in their reliability.

## Acknowledgments

I thank Douglas H. Turner and Hatim Allawi for stimulating conversations and for critical reading of the manuscript. I also thank Wayne State University and Hitachi Chemical Research for financial support.

## ABBREVIATIONS

- SVD: singular value decomposition
- NN: nearest neighbor
- e.u.: entropy unit [cal/(K·mol)]

## References

**National Academy of Sciences**

Proceedings of the National Academy of Sciences of the United States of America. 1998 Feb 17; 95(4):1460.
