Genetic information in higher organisms is encoded in centimeters-long DNA molecules, which are compacted into chromosomes in the cell nucleus through a hierarchical series of folding steps. In the first level of compaction, short stretches of (negatively charged) DNA (147 bp, 50 nm) are wrapped locally in ~2 turns around (positively charged) protein spools of 3.5 nm radius, forming *nucleosomes* (see Fig. 1, top inset). Nucleosomes are separated along the DNA by short stretches of unwrapped “linker” DNA, typically about 10-50 bp in length. Thus, about 75-90% of each chromosomal DNA molecule is wrapped in nucleosomes. DNA that is wrapped in nucleosomes is less accessible to many gene regulatory proteins, so the detailed locations of nucleosomes along the DNA have important biological consequences.

Figure 1. Temperature dependence of the overlap parameter Ψ, with temperature in units of the reference temperature. Short dash: μ = 62 with zero temperature occupancy of 89.6%. Solid line: μ = 24 with zero temperature occupancy of 83.3%. **...**

In an early model, Kornberg and Stryer^{i} (KS) treated the nucleosomes as a uniform *one-dimensional* (1D) *liquid of hard rods* with an excluded volume of the order of 147 bp. They attributed regularities in nucleosome positioning observed *in vivo* to the decaying density oscillations near a boundary that are characteristic of the 1D liquid of hard rods at higher densities^{ii}. The “boundaries” would be provided here, for example, by the site-specific binding of Transcription Factors (TFs), which, when bound to DNA, prevent that stretch of DNA from being wrapped in a nucleosome. Recently, Segal et al.^{iii} showed that nucleosomes in fact prefer certain DNA sequences over others, and that genomes utilize these sequence preferences to *bias* the preferred locations of nucleosomes. This sequence specificity of DNA/nucleosome binding is due to the sequence-dependent bending stiffness of DNA itself. Certain dinucleotides (one base pair followed by another) greatly enhance the ability of DNA to bend sharply in particular directions relative to the DNA helix axis, as required by the nucleosome. This sharp bending occurs every DNA helical repeat (~10 bp), when the major groove of the DNA faces inward towards the center of curvature, and again, about 5 bp out of phase and in the opposite direction, when the major groove faces outward. Bends in each direction are facilitated by specific dinucleotides. Segal et al.^{iii} used an alignment of natural nucleosome DNA sequences to create a *statistical profile* (a position-dependent Markov model) that represented the nucleosome's dinucleotide sequence preferences at each of the base-pair steps within the nucleosome. The resulting Markov model assigns a log-likelihood for an isolated nucleosome starting at a given base pair in the genome, which amounts to an effective *on-site potential* for nucleosomes. They then used a dynamic programming algorithm to solve exactly for the equilibrium distribution of hard rods on this potential landscape. The resulting equilibrium density profile of this 1D statistical mechanics model correctly predicted about half of the actual stable nucleosome positions of the yeast genome.
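As an illustration of how a position-dependent Markov model turns a sequence into an on-site potential, here is a minimal Python sketch. It is not the model of Segal et al.: the window width of 4 (instead of the nucleosome length), the probability tables `first` and `trans`, and the function names are all hypothetical, and the potential is simply taken as minus the log-likelihood, in units of the characteristic energy scale.

```python
import math

def log_likelihood(seq, start, first, trans):
    """Log-likelihood of the window beginning at `start`.

    first: dict base -> probability of the first base in the window
    trans: list of dicts, one per base-pair step; trans[k][(prev, cur)]
           is the probability of `cur` at position k+1 given `prev` at k
    """
    width = len(trans) + 1
    window = seq[start:start + width]
    logp = math.log(first[window[0]])
    for k in range(1, width):
        logp += math.log(trans[k - 1][(window[k - 1], window[k])])
    return logp

def on_site_potential(seq, first, trans):
    """V_n = -log P_n for every start site where the window fits."""
    width = len(trans) + 1
    return [-log_likelihood(seq, n, first, trans)
            for n in range(len(seq) - width + 1)]

# Toy tables (hypothetical): the middle step favours repeating the base.
BASES = "ACGT"
uniform = {(p, c): 0.25 for p in BASES for c in BASES}
biased = {(p, c): (0.7 if c == p else 0.1) for p in BASES for c in BASES}
first = {b: 0.25 for b in BASES}
trans = [uniform, biased, uniform]

V = on_site_potential("ACCGTTAGC", first, trans)
for n, v in enumerate(V):
    print(n, round(v, 3))
```

Start sites whose window places a repeated base at the favoured step receive a lower (more attractive) potential than their neighbours, which is the sense in which the sequence biases nucleosome positions.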

In many organisms, cells of different types or developmental states, all of which have the same genomic DNA sequences, organize their DNA into arrays of nucleosomes at quite different nucleosome densities (concentrations). For example, rat neuronal cells have an average nucleosome spacing of ~165 bp (~18 bp average linker DNA length) at birth, while 30 days later the same cells have increased the spacing to ~218 bp (~71 bp average linker DNA)^{iv}. The chromatin repeat length of brain cortex and cerebellar neurons changes concomitant with terminal differentiation. If slight changes in average density or nucleosome binding strength were to alter the nucleosome density profile in a drastic and chaotic manner, then the predicted nucleosome positions would not be “robust” and any functionality of nucleosome positioning would be in serious doubt. The aim of the present paper is to examine the *thermodynamic stability* of nucleosome positioning using the model of Segal et al.^{iii}.
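The repeat lengths quoted above translate directly into linker lengths and wrapped fractions. The short Python sketch below does this arithmetic, assuming 147 bp of wrapped DNA per nucleosome; the function name is illustrative only.

```python
NUCLEOSOME_BP = 147  # wrapped DNA per nucleosome

def wrapped_fraction(repeat_bp, core_bp=NUCLEOSOME_BP):
    """Fraction of DNA wrapped in nucleosomes for a given repeat length."""
    return core_bp / repeat_bp

for label, repeat_bp in [("at birth", 165), ("30 days later", 218)]:
    linker_bp = repeat_bp - NUCLEOSOME_BP
    frac = wrapped_fraction(repeat_bp)
    print(f"{label}: linker {linker_bp} bp, wrapped fraction {frac:.1%}")
```

Under this assumption the wrapped fraction drops from roughly 89% to roughly 67% between the two developmental states, which gives a sense of the range of densities the positioning pattern has to tolerate.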

We work in the grand-canonical ensemble with the *N*-nucleosome Hamiltonian:

$$H={\sum}_{i=1}^{N}{V}_{{n}_{i}}+{\sum}_{i=1}^{N-1}U({n}_{i+1}-{n}_{i})$$

Here, *n*_{i} is the location of the first base pair blocked by the *i*th nucleosome. The nucleosome-nucleosome interaction potential *U(m)* is equal to zero if *m* exceeds the hard-core size *a* = 157^{v} and infinite if *m* is less than or equal to *a*. Next, *V*_{n} = -*k*_{B}*T*_{0} log *P*_{n} is the nucleosome on-site potential energy, with *T*_{0} the *reference temperature*^{vi} (*k*_{B}*T*_{0} being the characteristic energy scale of the on-site potential) and *P*_{n} the likelihood for nucleosome binding at site *n*, as obtained from the Markov model analysis for the DNA sequence of yeast chromosome II^{vii}. The probability ρ_{n} for site *n* to be the first site of a stretch of *a* sites occupied by a nucleosome is obtained from a discrete version of an exact recursion relation, derived by Percus, that describes the statistical mechanics of a liquid of hard rods in an external potential^{viii}. This recursion relation,

$$\frac{{h}_{n}}{1-{h}_{n}}={e}^{\beta (\mu -{V}_{n})}{H}_{n}\qquad (1.2)$$

with ${H}_{n}={\prod}_{m=2}^{a}(1-{h}_{n+m-1})$ and β = 1/*k*_{B}*T*, is expressed in terms of the set of quantities *h*_{n}, which we obtain by numerical iteration of Eq. (1.2) from large to small *n*. The site probability ρ_{n} then follows from the relation ${h}_{i}={\rho}_{i}{(1-{\sum}_{j=1}^{a-1}{\rho}_{i-j})}^{-1}$, while the thermodynamic properties follow from the grand partition function $\Xi (\mu ,T)\propto {\prod}_{n}\left(\frac{1}{1-{h}_{n}}\right)$. Note that Eq. (1.2) reduces to the classical Langmuir isotherm ${\rho}_{n}=\frac{{e}^{\beta (\mu -{V}_{n})}}{1+{e}^{\beta (\mu -{V}_{n})}}$ in the absence of the excluded-volume interactions.
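The backward iteration and the subsequent recovery of ρ_{n} can be sketched in a few lines of Python. The sketch assumes that the recursion takes the form h_n/(1 - h_n) = exp[β(μ - V_n)] H_n, which is consistent with the definition of H_n, with the product form of Ξ, and with the Langmuir limit quoted above. It further assumes free ends (a rod must fit entirely on the lattice) and that a rod starting at site n blocks sites n to n + a - 1. The function name and the toy inputs are illustrative only.

```python
import math

def hard_rod_occupancy(V, mu, a, beta=1.0):
    """Start-site probabilities for hard rods of length `a` on a 1D lattice.

    V: on-site potentials, mu: chemical potential, beta: 1/(k_B T).
    Returns (rho, h, log_xi), where log_xi = -sum(log(1 - h_n)).
    """
    L = len(V)
    h = [0.0] * (L + a)              # padding: h = 0 beyond the lattice
    # Backward sweep from large to small n; start sites n > L - a do not fit.
    for n in range(L - a, -1, -1):
        H = 1.0
        for k in range(n + 1, n + a):    # product over h_{n+1} .. h_{n+a-1}
            H *= 1.0 - h[k]
        x = math.exp(beta * (mu - V[n])) * H
        h[n] = x / (1.0 + x)
    # Forward sweep: rho_i = h_i * (1 - sum_{j=1}^{a-1} rho_{i-j}).
    rho = [0.0] * L
    for i in range(L):
        covered = sum(rho[max(0, i - a + 1):i])
        rho[i] = h[i] * (1.0 - covered)
    log_xi = -sum(math.log1p(-h[n]) for n in range(L))
    return rho, h[:L], log_xi

# Toy example: flat potential, three sites, rods of length two.
rho, h, log_xi = hard_rod_occupancy([0.0, 0.0, 0.0], mu=0.0, a=2)
print(rho, math.exp(log_xi))
```

For this toy lattice the three allowed configurations (empty, rod at the first site, rod at the second site) carry equal weight, so the sketch can be checked by hand. For realistic values of β(μ - V_n) the exponential may overflow, and a production version would work with logarithms throughout.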

The level of correlation between the nucleosome site probabilities and the position dependence of the site potential *V*_{n} is characterized by the *overlap parameter* ψ:

Here, $\overline{V}$ is the mean on-site potential, *L* is the system size, and *V*_{min} is the site potential with the largest (i.e. most attractive) binding energy of the sequence. If all nucleosomes are located at optimal binding sites, where *V*_{n} equals *V*_{min}, then ψ equals one, while ψ is zero if there are no correlations between site probability and site potential, which occurs as *T* / *T*_{0} → ∞. Fig. 1 shows the dependence of the overlap parameter on the ratio *T/T*_{0} of the ambient and reference temperatures for different values of the chemical potential that correspond to densities in the biologically relevant range of 50% to 90% occupancy. In all cases, the overlap parameter decreases towards zero for *T/T*_{0} large compared to one. In this regime, the site occupation probability shows approximately sinusoidal density modulations with period equal to the hard-rod length (right inset) that are related to the KS density oscillations^{i}. The overlap parameter grows as *T/T*_{0} decreases and then saturates for *T/T*_{0} small compared to one. Note, from Fig. 1, that the value of ψ for small *T/T*_{0} increases with decreasing density. Indeed, in the *close-packing limit* (i.e., 100% occupancy) the site occupancy probability must be constant, which, according to Eq. 2, means that the overlap parameter is zero. Nevertheless, ψ remains surprisingly large even at 90% occupancy. A typical occupancy plot in this regime, shown in the left inset of Fig. 1, clearly indicates the preferred locations for the nucleosomes. It is only in this low-temperature regime that precise nucleosome locations can be predicted. The change between the two regimes, for *T/T*_{0} near one, can be viewed as a *freezing transition*, but it does not provide evidence for a true thermodynamic phase transition^{ix}.
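The two limiting values of ψ described above can be made concrete with a small Python sketch. The defining equation for ψ is not reproduced in this text, so the normalization used below is an assumption: it is one simple form that equals one when every nucleosome sits at *V*_{min} and zero when the site probability is uniform. The function name is illustrative.

```python
def overlap_parameter(rho, V):
    """Assumed form: sum_n rho_n (Vbar - V_n) / [(Vbar - V_min) sum_n rho_n]."""
    L = len(V)
    v_mean = sum(V) / L          # mean on-site potential over the L sites
    v_min = min(V)               # most attractive site potential
    total = sum(rho)
    if total == 0 or v_mean == v_min:
        raise ValueError("overlap is undefined for empty or flat input")
    weighted = sum(r * (v_mean - v) for r, v in zip(rho, V))
    return weighted / (total * (v_mean - v_min))

V = [0.0, -2.0, 1.0, 1.0]
print(overlap_parameter([0.0, 1.0, 0.0, 0.0], V))      # all weight at V_min
print(overlap_parameter([0.25, 0.25, 0.25, 0.25], V))  # uniform occupancy
```

Any definition with the same two limits would serve equally well for this illustration; the intermediate values depend on the normalization actually used.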

Figure 2. Site occupancy of an unstable section (bp 6,500-9,500) at T/T_{0} = 0.5 for different values of the chemical potential. The mean occupancies are 88.8%, 89.3%, and 90%. For higher, respectively, lower chemical potentials, the occupation pattern is stable with, **...**

Surprisingly, the occupancy profile for *T/T*_{0} = 0.5 contains *both* stable and unstable sequences. Most sequences exhibit well-defined density profiles of the form shown in Fig. 1 that allow an unambiguous assignment of nucleosome positions and that are stable with respect to small changes in μ. These stable sequences are interspersed with shorter sequences that have lengths of the order of 1,000 bp and poorly defined density profiles. The fraction of unstable sequences is about 15% at μ = 24 for *T/T*_{0} = 0.5, while it decreases to zero in the limit of small *T/T*_{0}. The middle panel of Fig. 2 shows a typical example of such an unstable sequence, with a length of 3,000 bp. If one slightly increases the chemical potential (from 89.3% to 90% mean occupancy), then the occupancy profile of this section evolves into a well-defined nucleosome configuration (with *N* = 12 nucleosomes; Fig. 2, top panel). Similarly, after a very slight decrease in mean occupancy, to 88.8%, the section again transforms to a well-defined configuration (with *N* = 11 nucleosomes; Fig. 2, bottom panel).

The physical meaning of the “disordered” occupancy profile of the middle panel becomes clear if one notes that it is simply the *superposition* of the stable *N* = 11 and *N* = 12 configurations. This suggests that the transition from the *N* = 11 to the *N* = 12 configuration as a function of the chemical potential can be viewed as a “micro” first-order phase transition, with the free energies of the *N* = 11 and *N* = 12 states degenerate at the transition point. The middle panel would represent a form of finite-size *phase coexistence*. Indeed, plots of the mean number of nucleosomes <N> of the section, which can be viewed as an order parameter, as a function of the chemical potential μ for different values of *T/T*_{0} (see Fig. 3) closely resemble the M-H magnetization isotherms of a 1D Ising model with transition temperature *T*_{0}. For *T/T*_{0} small compared to one, a well-defined, rapid switch takes place between the *N* = 11 and *N* = 12 states as a function of μ. We also performed Monte Carlo simulations^{x} to determine the distribution of ψ at μ / *k*_{B}*T*_{0} ≈ 47.9, where the *N* = 11 and *N* = 12 states are degenerate at low temperatures (see Fig. 3, inset). Indeed, we find a broad distribution at high temperatures and a sharply peaked, bimodal distribution at low temperatures, the hallmark of a first-order transition. The appearance of micro first-order transitions is *not* an accidental feature specific to the DNA of yeast chromosome II. Indeed, if we replace *V*_{n} with a Gaussian random site potential having the same RMS width as *V*_{n}, then the fraction of unstable sequences remains comparable to that of yeast chromosome II. The transitions are in fact a *generic* consequence of the unavoidable competition, at higher occupancy levels, between arrangements with different density but the same free energy.
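The resemblance to magnetization isotherms can be illustrated with a two-state truncation, sketched below in Python. This is an assumption made for illustration, not the full model: only the *N* = 11 and *N* = 12 configurations of the section are kept, their free energies are taken to be degenerate at a temperature-independent μ* = 47.9 (in units of *k*_{B}*T*_{0}), and all names are hypothetical. Under these assumptions <N> follows a hyperbolic tangent in β(μ - μ*).

```python
import math

def mean_nucleosome_number(mu, t_ratio, mu_star=47.9, n_low=11, n_high=12):
    """Two-state estimate of <N>; mu and mu_star in units of k_B*T_0.

    t_ratio is T/T_0. The relative grand-canonical weight of the two
    states is exp(x) with x = (n_high - n_low) * (mu - mu_star) / t_ratio.
    """
    x = (n_high - n_low) * (mu - mu_star) / t_ratio
    midpoint = 0.5 * (n_low + n_high)
    half_step = 0.5 * (n_high - n_low)
    return midpoint + half_step * math.tanh(0.5 * x)

for t_ratio in (0.1, 0.5, 1.0):
    row = [round(mean_nucleosome_number(mu, t_ratio), 3)
           for mu in (46.9, 47.7, 47.9, 48.1, 48.9)]
    print(f"T/T0 = {t_ratio}: {row}")
```

The switch between the two plateaus sharpens as *T/T*_{0} decreases, which is the qualitative behavior described above. At μ = μ* the two states are equally weighted, so the occupancy profile in this picture is an equal superposition of the two configurations.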

Figure 3. Average number of nucleosomes between sites 6970 and 8730 of yeast chromosome II as a function of chemical potential. Solid line: *T* / *T*_{0} = 0.1. Short dash: *T* / *T*_{0} = 0.5. Long dash: *T* / *T*_{0} = 1. Inset: Histogram of the overlap (order) parameter Ψ in **...**

For lower values of *T/T*_{0}, the nucleosome conformation in the switching regions is thus exquisitely sensitive to small changes in the chemical potential and in the characteristic energy scale *k*_{B}*T*_{0} of the DNA-nucleosome interaction. Cells of different types or different developmental states with different nucleosome densities are known to be characterized by chemical modification of the nucleosomes (methylation or acetylation) that would alter this characteristic energy scale. The model thus predicts that changes in nucleosome density in response to chemical modification will be localized to the switching regions. A corollary of this prediction is the possibility that the location of the switching regions on the chromosome correlates with the location of gene-regulatory sequences, specifically the binding sites of Transcription Factors. To test this hypothesis, we collected 278 published TF binding locations on chromosome II from the SGD database. We equilibrated the system at ~75% and ~90% average occupancies and calculated the absolute value of the difference in the probability of site occupancy. The mean (absolute) change was found to be 22.7%. When we calculated the mean absolute change in occupancy on the restricted set of the 278 TF binding locations, we found it to be 32.9%. To check the statistical significance of this result, we randomly chose 250 different sets of 10-base-pair regions and calculated the average absolute change in site occupancy on changing the density. We repeated this procedure to generate a distribution of average absolute changes and found a standard deviation of 2% from the mean. This means that the statistical probability of randomly achieving a mean value of 32.9% is less than 10^{-7}. It follows that at least *some* TF binding sites are strategically placed on segments of DNA that on average are more likely to reconfigure in response to changes in nucleosome concentration or affinity.
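The size of this effect can be expressed as a number of standard deviations. The Python sketch below uses only the rounded figures quoted above and assumes that the distribution of resampled means is Gaussian; the function name is illustrative.

```python
import math

def gaussian_upper_tail(z):
    """One-sided tail probability P(Z > z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

mean_all = 22.7   # mean absolute occupancy change over all sites (%)
mean_tf = 32.9    # mean absolute change at the TF binding locations (%)
sigma = 2.0       # standard deviation of the resampled means (%)

z = (mean_tf - mean_all) / sigma
p = gaussian_upper_tail(z)
print(f"z = {z:.2f}, one-sided Gaussian tail = {p:.1e}")
```

With these rounded inputs the shift is about five standard deviations, and a Gaussian tail at that distance is of order 10^{-7}. The exact value is sensitive to rounding in the quoted standard deviation.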

In summary, we have shown that a simple 1D model for nucleosome positioning, which correctly predicts about 50% of the nucleosome locations along yeast chromosome II, is characterized by a sequence of highly localized, first-order rearrangement transitions that occur as a function of the characteristic interaction energy between the DNA and nucleosomes. The transitions are a generic consequence of frustration between the requirements of the excluded-volume interactions and the on-site potential energy. The location of the first-order rearrangement transitions correlates with the location of TF binding sites, indicating that nature exploits this frustration in the form of gene-regulatory switches that distinguish cells in different states of development.