Nucleic Acids Res. 2008 Jun; 36(11): 3570–3578.
Published online 2008 May 3. doi:  10.1093/nar/gkn173
PMCID: PMC2441786

Spatial effects on the speed and reliability of protein–DNA search


Strong experimental and theoretical evidence shows that transcription factors (TFs) and other specific DNA-binding proteins find their sites using a two-mode search: alternating between three-dimensional (3D) diffusion through the cell and one-dimensional (1D) sliding along the DNA. We show that, due to the 1D component of the search process, the search time of a TF can depend on the initial position of the TF. We formalize this effect by discriminating between two types of searches: global and local. Using analytical calculations and simulations, we estimate how close a TF and binding site need to be to make a local search likely. We then use our model to interpret the wide range of experimental measurements of this parameter. We also show that local and global searches differ significantly in average search time and the variability of search time. These results lead to a number of biological implications, including suggestions of how prokaryotes achieve rapid gene regulation and the relationship between the search mechanism and noise in gene expression. Lastly, we propose a number of experiments to verify the existence and quantify the extent of spatial effects on the TF search process in prokaryotes.


Protein–DNA interactions are vitally important for every cell. Transcription factors (TFs) are proteins that interact with specific DNA sequences to regulate gene expression. The targeting of TFs to their sites is a passive process; therefore, it seems natural to assume that TFs simply diffuse through the nucleus (in eukaryotes) or cell (in prokaryotes) until they find their sites.

In the 1970s, this assumption was challenged by the observation that, in vitro, the prokaryotic TF LacI is able to find its binding site 100 times faster than expected by three-dimensional (3D) diffusion in the solvent (1). This led to the suggestion of a ‘facilitated diffusion’ mechanism in which TFs alternate between 3D diffusion, jumping, through the volume of the cell and one-dimensional (1D) sliding along the DNA to rapidly locate their binding sites (2–4). This hypothesis was corroborated by several pieces of evidence—most strikingly several single molecule studies in which the authors visualized individual proteins sliding along DNA (5–7). Several groups have also mathematically modeled this process and shown it to be a plausible way of making the search significantly faster than 3D diffusion alone (3,4,8–11).

Several aspects of facilitated diffusion, however, remain puzzling, e.g. the effect of the DNA sequence composition and conformational transitions in the protein on the rate of sliding (10,12) and the role of DNA conformation (11). Here we consider how spatial effects influence the search process. Specifically, we ask whether and how search time depends on the initial distance between the protein and the target site.

The distance dependence of the TF search process has not been considered before because the rate of a bimolecular reaction in 3D is distance-independent (13). Therefore, the time it takes for a protein diffusing in 3D to find its target does not depend on the initial distance between the two, as long as this distance is greater than the size of the target. In contrast, the time of search in two dimensions (2D) (e.g. on a membrane) or in 1D (e.g. along DNA or along a filament) is distance-dependent (13). Therefore, we ask: can the 1D component of facilitated diffusion make search much faster for a protein that starts a small distance from its target site?

Here we use simulations and analytical estimates to demonstrate that TF search time indeed depends on the initial position of the TF with respect to its binding site. We show that the trajectories can be naturally separated into fast local and slow global searches (Figure 1A). We find that if a TF starts sufficiently close—less than 1000 base pairs (bp) for our model organism Escherichia coli—to its binding site, a local search is likely.

Figure 1.
(A) We defined two types of searches: local searches in which the TF finds its binding site quickly using only hops and slides, and global searches in which the TF finds its binding site using hops, jumps and slides. In this illustration, the black oval ...

While studying how spatial effects contribute to the search process, we observe that upon dissociation from DNA, a protein is likely to quickly re-associate near its dissociation point, thus making a short-range hop, rather than a long-range jump (Figure 1B). We examine how these two types of spatial excursions influence the search process, allowing us to reconcile the widely ranging experimental measurements of the sliding length (6,7,14,15).

Finally, we show that the strong non-specific binding of TFs to DNA makes global search rather slow, thus making local search appreciably faster. Moreover, local searches have significantly smaller variance in the search time, making them an attractive mechanism to deliver DNA-binding proteins to their targets quickly and reliably.

There are a number of biological implications of these spatial effects. Since transcription and translation are coupled in bacteria, proteins are produced near the location of their genes. Therefore, TFs whose genes are co-localized with their binding sites are likely to use a local search mechanism. The efficiency of local search provides a physical justification for the observed co-localization of TF genes and their binding sites in prokaryotic genomes (16–18). We also propose a number of experiments to test the mechanism and its predictions.


Characterizing hops using simulations

To include hops in the search model, we needed to estimate the relative frequency of hops and jumps and the displacement due to hops. Assuming that DNA could be treated as straight rods on the length scale of a hop, we considered the problem in a cylindrical geometry and simplified it further to a 2D geometry (Figure 2A). In the 2D cross-section, DNA strands are represented as absorbing circles. To simulate diffusion in 2D, we discretized the cross section into a 1 µm² square lattice with 1 nm spacing and randomly distributed DNA strands, each with an absorbing radius of 2 nm. We simulated a TF trajectory as a random walk on the lattice, starting from its dissociation from DNA and ending with its association to DNA. Trajectories that started and ended on the same DNA strand were called hops; otherwise they were jumps (Figure 2A). From these trajectories, we calculated the probability of a hop as a function of the number of DNA strands in the lattice (Figure 2B).
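The lattice procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the periodic boundaries, the release point 4 nm from the strand centre (chosen to match the r = 4 nm used in the analytical estimate), and the names `hop_fraction`, `release` and `owners` are all assumptions made here. The demonstration uses a 200 nm lattice with 60 strands, which has the same strand density as 1500 strands per 1 µm².

```python
import random

MOVES = ((1, 0), (-1, 0), (0, 1), (0, -1))  # nearest-neighbour steps, 1 nm each


def hop_fraction(n_strands, n_walks, size=1000, radius=2, release=4, seed=0):
    """Estimate the fraction of walks that re-associate with their home strand.

    Assumptions (not from the paper): periodic boundaries, and release of the
    TF `release` nm from the centre of a randomly chosen strand.
    """
    rng = random.Random(seed)
    centres = [(rng.randrange(size), rng.randrange(size)) for _ in range(n_strands)]

    # Map every lattice site inside an absorbing disc to the strands that own it.
    owners = {}
    for idx, (cx, cy) in enumerate(centres):
        for dx in range(-radius, radius + 1):
            for dy in range(-radius, radius + 1):
                if dx * dx + dy * dy <= radius * radius:
                    site = ((cx + dx) % size, (cy + dy) % size)
                    owners.setdefault(site, set()).add(idx)

    hops = 0
    for _ in range(n_walks):
        home = rng.randrange(n_strands)
        cx, cy = centres[home]
        x, y = (cx + release) % size, cy
        while (x, y) not in owners:      # walk until any strand absorbs the TF
            dx, dy = MOVES[rng.randrange(4)]
            x, y = (x + dx) % size, (y + dy) % size
        if home in owners[(x, y)]:       # same strand: hop; otherwise: jump
            hops += 1
    return hops / n_walks


# Small demonstration at roughly the strand density quoted in the text.
p_hop_estimate = hop_fraction(n_strands=60, n_walks=200, size=200, seed=1)
print(f"estimated hop fraction: {p_hop_estimate:.2f}")
```

The estimate from such a small run is noisy and depends on the assumed release distance, so it should not be expected to reproduce the published value exactly.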

Figure 2.
(A) DNA exists in a compacted form in vivo, as illustrated on the top. To model the relative frequency and properties of hops and jumps, we looked at a 2D cross section of the DNA, imagining the DNA strands to be approximately straight rods on the short ...

Using the length of the hop trajectories, we also calculated the displacement along the DNA strand during a hop for lattices with 1500 strands, the approximate density of DNA in E. coli. We assumed that, in the 3D geometry, two-thirds of the random walk steps were in the 2D plane and one-third were in the z-direction—along the DNA. Therefore, given the length of the hop trajectory in 2D, we drew the number of 1 nm steps along the DNA strand, z, from the negative binomial probability distribution function

$$P(z \mid y) = \binom{z + y - 1}{z} \left(\frac{2}{3}\right)^{y} \left(\frac{1}{3}\right)^{z}$$

where y is the number of 1 nm steps in the 2D cross section. To calculate the net displacement along the DNA strand resulting from a 1D random walk with z 1 nm steps, we drew the displacement (in nm) c from the normal distribution with a mean of 0 and a variance of z, $c \sim \mathcal{N}(0, z)$. As can be seen in Figure 2C, the median absolute displacement resulting from these hops is approximately 1 bp, which is much smaller than the persistence length of DNA (150 bp), the length scale on which DNA is approximately straight. This justified the use of a 2D projection of 3D DNA, since, on the length scale of a hop, DNA is approximately straight. For each DNA density, a total of 10⁶ random walks were simulated: 1000 lattices and 1000 walks per lattice.
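The two sampling steps described above (the number of along-DNA steps z given y in-plane steps, then a net displacement with variance z) can be illustrated as follows. This is a sketch under stated assumptions: the step-by-step loop is used here as an equivalent way to draw from a negative binomial distribution with y successes and success probability 2/3, and the function names are hypothetical.

```python
import math
import random


def sample_z_steps(y, rng, p_plane=2.0 / 3.0):
    """Count along-DNA steps taken while y in-plane steps accumulate.

    Each step is in-plane with probability p_plane, so the count of
    along-DNA steps follows a negative binomial distribution.
    """
    z = 0
    in_plane = 0
    while in_plane < y:
        if rng.random() < p_plane:
            in_plane += 1
        else:
            z += 1
    return z


def sample_displacement_nm(y, rng):
    """Net displacement along the DNA (nm) for a hop with y in-plane steps."""
    z = sample_z_steps(y, rng)
    # A 1D walk of z unit steps is approximated by a normal with variance z.
    return rng.gauss(0.0, math.sqrt(z)) if z > 0 else 0.0


rng = random.Random(1)
z_samples = [sample_z_steps(20, rng) for _ in range(2000)]
mean_z = sum(z_samples) / len(z_samples)   # expected value is y / 2 = 10
print(f"mean z for y = 20: {mean_z:.1f}")
```

With two-thirds of steps in the plane, the expected number of along-DNA steps is y/2, which gives a quick sanity check on the sampler.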

Simulating transcription factor searches

To simulate the search process in its entirety, we first created a DNA strand M bp long and randomly selected one site to be the binding site. The TF started d bp away from the binding site. (See Tables 1 and 2 for parameter definitions and values.) The TF then alternated between 1D slides and 3D moves (hops or jumps). The slides were modeled as 1D random walks in which the TF could take a 1 bp step to the left or right or dissociate with a probability determined by s, the average number of base pairs scanned during a slide. At the end of each slide, the TF hopped with probability phop = 0.8325 (derived from lattice simulations with 1500 DNA strands); otherwise it jumped. Hops were simulated using the empirical distribution shown in Figure 2C, and jumps were simulated by picking a random association point. When the TF landed on the binding site, the search was terminated.
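A compact sketch of this slide/hop/jump loop is given below. Several pieces are assumptions made for illustration only: `p_off` is a hypothetical per-step dissociation probability standing in for the s-dependent expression, hops move the TF by -1, 0 or +1 bp as a stand-in for the empirical distribution of Figure 2C, the DNA is treated as circular, and the parameter values in the demonstration are arbitrary and far smaller than a genome.

```python
import random

P_HOP = 0.8325  # hop probability quoted in the text


def run_search(M, d, p_off, rng):
    """Return the number of jumps used before the TF lands on its site.

    p_off is a hypothetical per-step dissociation probability; hops move the
    TF by -1, 0 or +1 bp as a placeholder for the empirical hop distribution.
    """
    target = rng.randrange(M)
    pos = (target + d) % M          # circular DNA is an assumption
    jumps = 0
    while True:
        # 1D slide: step left or right until the TF dissociates or finds the site.
        while pos != target and rng.random() >= p_off:
            pos = (pos + rng.choice((-1, 1))) % M
        if pos == target:
            return jumps
        # 3D move: short hop near the dissociation point, or a randomizing jump.
        if rng.random() < P_HOP:
            pos = (pos + rng.choice((-1, 0, 1))) % M
        else:
            pos = rng.randrange(M)
            jumps += 1


def estimate_p_local(M, d, p_off, runs, seed=0):
    """Fraction of runs that reach the site without any jump (local searches)."""
    rng = random.Random(seed)
    local = sum(run_search(M, d, p_off, rng) == 0 for _ in range(runs))
    return local / runs


p_near = estimate_p_local(M=500, d=2, p_off=0.02, runs=100)
p_far = estimate_p_local(M=500, d=200, p_off=0.02, runs=100)
print(f"p_local at d = 2 bp: {p_near:.2f}; at d = 200 bp: {p_far:.2f}")
```

Counting jumps per run is enough to classify a trajectory as local (zero jumps) or global, which is how the probability of a local search is defined in the text.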

Table 1.
Model variables and functions
Table 2.
Model parameters and estimates

For the simulation-based estimates of plocal, the probability that the TF finds its binding site using only slides and hops, but no jumps, we simulated 1000 runs for each combination of d and s, using values of s corresponding to $K_d^{NS} = 10^{-6}$, $10^{-5}$, $10^{-4}$ and $10^{-3}$ M, and values of d between 0 and 3000 bp. (See Table 2 for details of the relationship between s and $K_d^{NS}$, the equilibrium dissociation constant of a TF and a piece of non-specific DNA.) For the simulation-based estimates of search time presented in Figures 3B and 3C, we used $K_d^{NS} = 10^{-5}$ M and 5000 runs for each d. To find the average search time for n TFs, we simulated runs in groups of n, took the minimum search time of the group, and averaged this over all groups.
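The grouping step for n TFs amounts to averaging the fastest run in each group of n single-TF runs. A minimal sketch follows; the function name is hypothetical and the exponentially distributed times are placeholder data, not simulation output.

```python
import random


def mean_min_search_time(times, n):
    """Average, over consecutive groups of n runs, of the fastest run per group."""
    if n < 1 or n > len(times):
        raise ValueError("n must be between 1 and the number of runs")
    groups = [times[i:i + n] for i in range(0, len(times) - n + 1, n)]
    return sum(min(group) for group in groups) / len(groups)


rng = random.Random(0)
single_tf_times = [rng.expovariate(1.0) for _ in range(5000)]  # placeholder data
t_one = mean_min_search_time(single_tf_times, 1)
t_ten = mean_min_search_time(single_tf_times, 10)
print(f"n = 1: {t_one:.3f}, n = 10: {t_ten:.3f}")
```

Taking the minimum within each group models n independent TFs searching in parallel, where the search ends as soon as any one of them finds the site.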

Figure 3.
(A) The probability of a local search depends on the effective sliding length, se, of the TF and the initial distance between the TF and its binding site d. Here we show the relationship for several values of $K_d^{NS} = 10^{-6}$, $10^{-5}$, $10^{-4}$ ...


Why and how is the transcription factor search distance-dependent?

As Polya purportedly told the drunkard wandering the streets looking for his home, ‘You can't miss; just keep walking, and stay out of 3D!’ In 3D, diffusion is non-redundant, i.e. the probability of revisiting a particular site is less than one (13). As a consequence of this property, the average time to find a particular site does not depend on initial position. Conversely, in 1D, diffusion is highly redundant and search time strongly depends on initial position.

In the TF search process, the search time becomes independent of initial position as soon as the TF diffuses in 3D. Therefore, in previous models (3,8–11), the calculated mean search time (ts) is independent of initial position. The search time is presented in different forms, but all are approximately equivalent to the average number of rounds of 1D and 3D diffusion multiplied by the average time of each round:

$$t_s = \frac{M}{s}\left(\tau_{1D} + \tau_{3D}\right) \qquad (1)$$

(See Supplementary Data for details.) Here M is the genome length in bp, s is the average number of bp scanned in one slide, τ1D is the average duration of one slide, and τ3D is the average duration of one jump. (See Tables 1 and 2 for variable and parameter definitions.)

However, if a TF can find its site by sliding along the DNA and not jumping (i.e. ‘staying out of 3D’), in what we call a local search, the search time, $t_s^{local}$, will be dependent on its initial position (Figure 1A). Otherwise, assuming that a jump brings the protein to a random location of the DNA, the search will be global, i.e. the TF forgets its initial location and must sample the entire DNA molecule to find its site. In this case, the mean search time, $t_s^{global}$, will be given by Equation (1).

Therefore, the mean search time for a TF starting at a distance d from its binding site is an average of $t_s^{local}$ and $t_s^{global}$, weighted by the probability that a TF will find its site via a local search, plocal, or global search, 1 – plocal:

$$t_s(d) = p_{local}(d)\, t_s^{local}(d) + \left[1 - p_{local}(d)\right] t_s^{global}$$

Logically, plocal should be a monotonically decreasing function of d. Therefore, if a TF starts close enough to its binding site, it is likely to find it using a local search.
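The weighted average of local and global search times can be written as a one-line function. The sketch below uses purely hypothetical numbers (a 1 s local search and a 900 s global search) to show how quickly the mean becomes dominated by the global branch as plocal falls; the function name is an assumption.

```python
def mean_search_time(p_local, t_local, t_global):
    """Average of local and global search times, weighted by p_local."""
    if not 0.0 <= p_local <= 1.0:
        raise ValueError("p_local must be a probability")
    return p_local * t_local + (1.0 - p_local) * t_global


# Hypothetical values: a 1 s local search and a 900 s global search.
for p in (0.9, 0.5, 0.1):
    print(f"p_local = {p}: mean search time = {mean_search_time(p, 1.0, 900.0):.1f} s")
```

Even a modest probability of falling back to a global search sets the scale of the mean, which is why the distance at which plocal drops off matters so much.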

How close does a transcription factor need to be to its site to find it with a local search?

In the Supplementary Data, we derive plocal:

equation image

This result is quite intuitive; local searches are likely when a TF starts less than approximately s/2 bp away from its binding site, half the length covered in a single slide. The sliding length, s, depends on a TF's equilibrium dissociation constant for non-specific DNA, $K_d^{NS}$, and its 1D diffusion coefficient, D1D, as shown in Table 2. In our model organism, E. coli, s ranges from 30 to 900 bp.

The picture changes when we consider the possibility that some jumps may not be completely randomizing. The current model of jumps assumes that after a TF dissociates from the DNA, all sites on DNA are equally likely to be the association site. However, due to DNA packing, it is likely that there is an increased probability of associating near the dissociation point. Since we do not have a clear picture of DNA packing in the cell, we make the assumption that spatial excursions can be of two extreme types: hops, small dissociations from the DNA in which the protein re-associates to the same region of DNA at a distance smaller than or equal to its persistence length (150 bp), and jumps, excursions in which each site of DNA is equally likely to be the association point (8,10), (Figure 1B). As we show below, this assumption allows us to study spatial effects on the search process, using only information about the DNA density and its persistence length to characterize hops. Since we do not have enough information to completely characterize jumps, we make the simplest assumption, as others have done (8,10), i.e. all landing points are equally probable.

To include hops into the search model, we first need to estimate the probability of hopping, phop, versus jumping, pjump = 1 – phop. Since we assume that hops happen on length scales shorter than the persistence length of DNA, we can consider the DNA as cylinders, where hopping corresponds to a TF returning to the same cylinder it dissociates from, and jumping corresponds to associating to a different cylinder (Figure 1C). Since we are picturing the DNA as cylinders, we can then move to the 2D problem of return to a circle in the presence of other absorbing points, as depicted in Figure 2A, and use both analytical and simulation-based techniques to estimate phop using the DNA density in E. coli.

To estimate phop analytically, we make a further approximation by assuming the picture corresponds to two concentric absorbing circles. The inner circle, with radius R, corresponds to the DNA strand from which the TF dissociates and the outer circle, with radius R+, is an effective shell of absorption by all the other DNA strands. The TF is released at some distance r from the center of the circles. The probability of hopping, i.e. returning to the inner circle, is then $p_{hop} = \ln(R_{+}/r)/\ln(R_{+}/R)$ (13). Since this is only an approximation of the true picture, we use this calculation only to set the bounds for phop by assuming R+ is minimally the distance between DNA strands, ∼0.1 µm, and maximally the radius of the E. coli cell, ∼1 µm. We set r = 4 nm and R = 2 nm, which gives us an estimate of phop between 0.82 and 0.89. We note that the probability of hopping is still quite high if the TF is released a few nanometers away from the original DNA strand, as newly translated TFs would be in prokaryotes, where transcription and translation are coupled and TFs are therefore produced in the vicinity of their genes.
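The quoted bounds can be checked numerically. The sketch below assumes the standard 2D result for a diffusing particle between two concentric absorbing circles, ln(R+/r)/ln(R+/R) for reaching the inner circle first; this form is used here because it reproduces the 0.82 and 0.89 bounds from the stated r, R and R+ values. The function and argument names are illustrative.

```python
import math


def hop_probability(r_nm, inner_nm, outer_nm):
    """Probability of reaching the inner absorbing circle before the outer one (2D)."""
    if not inner_nm <= r_nm <= outer_nm:
        raise ValueError("release point must lie between the two circles")
    return math.log(outer_nm / r_nm) / math.log(outer_nm / inner_nm)


p_low = hop_probability(4.0, 2.0, 100.0)     # R+ ~ 0.1 um, spacing between strands
p_high = hop_probability(4.0, 2.0, 1000.0)   # R+ ~ 1 um, radius of the cell
print(round(p_low, 2), round(p_high, 2))     # prints 0.82 0.89
```

Because the dependence on R+ is logarithmic, a tenfold change in the outer radius shifts the hop probability by only a few percent.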

Using the same 2D formulation shown in Figure 2A, we also estimate phop by simulation (Materials and Methods section). We find that, for a biological density of DNA, the probability of a TF hopping is large (>0.80). In our subsequent simulations, we assume phop = 0.83, a quantity corresponding roughly to the density of DNA in the E. coli nucleoid. In reality, the DNA density is not uniform over the volume of the nucleoid and phop will vary accordingly. However, in Figure 2B, we show that the change in phop is small for large changes in DNA density. The obtained value of phop allows us to calculate the number of hops a TF makes before it jumps to a new region of DNA. Using simulations we also find, as others have suggested (3,8), that hops are very short, with a median displacement of 1 bp (Figure 2C). Therefore, in the following treatment, we coarse-grain hops into an effective slide. Thus, both the effective sliding time and the effective sliding distance, se, are increased relative to a single slide. This gives us:

equation image

In E. coli, se ranges from 70 to 2000 bp. Figure 3A shows plocal as a function of d for several values of se, as expressed in Equation (5) and confirmed by simulation. The correspondence between the simulation, which includes hops explicitly, and the analytical estimate validates our proposal to coarse-grain hops into slides. Thus, the addition of hops simply extends the reach of local searches.

Why do sliding length measurements vary widely?

A critical parameter in this analysis is sliding length, s. Our analysis may help to understand the wide range in the measurements of sliding lengths for different proteins (Table 3). Some of these differences are certainly due to differences in the proteins and experimental conditions. In particular, since the non-specific binding of a protein to DNA is driven almost entirely by electrostatics, a protein's non-specific affinity depends strongly on ion concentration (19–21). We also propose that, in some experiments, it is likely that hops are included in the sliding measurement. In the first two experiments listed in Table 3, the experimental designs allow for the unambiguous identification of hops and slides, and the measured slide lengths are on the low end of the scale (14,15).

Table 3.
Recent sliding-length experiments

In the second two experiments, the slide lengths were measured by single-molecule imaging of proteins on DNA (6,7). Given our modeling results, we propose that hops are too short to be seen in a single-molecule experiment. (Halford and Marko also predicted this resolution problem (9), though these measurements had not yet been made at that point.) The median hop displacement is only 1 bp = 0.34 nm, while the resolution of the experiments is 10–50 nm. Authors of the single-molecule studies have taken the independence of the diffusion coefficient from the ionic strength as evidence for a lack of hops. Clearly, such small hops could not significantly alter the diffusion coefficient. Our results demonstrate that the major contribution of hops is to the duration of sliding rather than to its rate. Thus our model and the notion of small hops help to reconcile these different sliding lengths and the seemingly contradictory results about the existence of hops.

Are local searches much faster than global searches?

This analysis of local and global searches is not biologically relevant unless there is a significant difference between the duration of each, $t_s^{local}$ and $t_s^{global}$. We again use analytical and simulation-based approaches. In the Supplementary Data, we show

equation image

equation image

Since M, the length of the genome, is quite large compared to values of d for which local search is likely (<1000 bp), global searches are indeed much longer than local searches.

Figure 3B shows the simulated and estimated values of search time, ts, as a function of d for several different values of n, the copy number of TFs per cell. Here we use $K_d^{NS} = 10^{-5}$ M, which gives an effective sliding length of ∼700 bp. There is a dramatic difference between ts for small and large d. When considering n = 10, the estimated copy number of LacI tetramers per E. coli cell (22), the search time of 10 TFs for d < 700 bp is less than 3.5 min, but is about 15 min for d > 2000 bp, a time comparable to the duplication time in bacteria.

The initial distance between the TF and its target also affects the reliability of the search. In Figure 3C, we show box-and-whisker plots for the search time of a single TF at d = 50, 200 and 2000 bp. Not only is the median dramatically smaller at short distances (under 1 s for d = 50 and 200 bp compared to over 100 min for d = 2000 bp), but the spread of the distributions is vastly different: the interquartile range is 0.1 s for d = 50 bp, 90 min for d = 200 bp, and 170 min for d = 2000 bp.

Why are global searches so slow?

As has been pointed out by several authors, independent of other parameters, the global search time is minimized when τ1D = τ3D, i.e. when the TF spends equal amounts of time sliding along the DNA and diffusing through the DNA volume (8–10). This balances the acceleration of the search due to fewer rounds of search with the deceleration due to longer rounds of search. Since

$$\frac{\tau_{1D}}{\tau_{3D}} = \frac{[D]}{K_d^{NS}}$$

where [D] is the concentration of non-specific DNA in the cell, the search time is minimized when $K_d^{NS} = [D]$. In E. coli, [D] = $10^{-2}$ M, and the measured values of $K_d^{NS}$ range between $10^{-3}$ and $10^{-6}$ M (23). Therefore, in vivo, $K_d^{NS}$ is not optimized to minimize search time and can result in global search times between 15 and 500 min for n = 1 TF (Figure 4).
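The optimum at equal sliding and 3D times can be checked in a few lines. The sketch below assumes, beyond what this section states, that the sliding length scales diffusively with the slide duration, $s = c\sqrt{D_{1D}\tau_{1D}}$, with $c$ a placeholder constant, and that $M$, $D_{1D}$ and $\tau_{3D}$ are held fixed while $\tau_{1D}$ varies.

```latex
\begin{align}
t_s &= \frac{M}{s}\left(\tau_{1D} + \tau_{3D}\right),
  \qquad s = c\sqrt{D_{1D}\,\tau_{1D}} \\
t_s &= \frac{M}{c\sqrt{D_{1D}}}
  \left(\tau_{1D}^{1/2} + \tau_{3D}\,\tau_{1D}^{-1/2}\right) \\
\frac{\partial t_s}{\partial \tau_{1D}}
  &= \frac{M}{2c\sqrt{D_{1D}}}\;\tau_{1D}^{-3/2}\left(\tau_{1D} - \tau_{3D}\right) = 0
  \;\Longrightarrow\; \tau_{1D} = \tau_{3D}
\end{align}
```

The derivative is negative for $\tau_{1D} < \tau_{3D}$ and positive beyond it, so the stationary point is a minimum. With the values quoted in the text, the measured non-specific dissociation constants lie one to four orders of magnitude below the DNA concentration of $10^{-2}$ M, which places the TF well on the long-slide side of this optimum.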

Figure 4.
The global search time for a single TF depends non-monotonically on its affinity for non-specific DNA, measured by the dissociation constant, $K_d^{NS}$. The search time is minimized when $K_d^{NS}$ is equal to the concentration of non-specific DNA, [D] = $10^{-2}$ M. However, ...

Several other studies that have examined the facilitated diffusion mechanism estimated that a TF could find its binding site much more quickly than our estimates of $t_s^{global}$. In their seminal work, Berg, Winter and von Hippel study in some depth the rate of the TF search process (3,4,19). In the concluding paper of a three-paper series, they put together measured and estimated parameters for the search and arrive at a search time of ∼2 s for n = 10 (4). In this estimate, however, they used values of D1D and D3D that are about an order of magnitude larger than recently measured and currently accepted values for in vivo diffusion (24,25), and a value of $K_d^{NS}$ at the upper limit of the range. Using our values, we get a search time of ∼100 s for 10 TFs and ∼15 min for 1 TF. Three other groups use different approaches to arrive at similar search time expressions. Coppey et al. use realistic parameter values to estimate a rapid search time for a short piece of DNA, but since they are considering in vitro experiments with a restriction enzyme, they do not consider the case where the DNA length is genome-sized (8). In their estimates, Halford and Marko assume that D1D and D3D are equal (and an order of magnitude larger than the measured in vivo D3D) and that s is optimal, resulting in a rapid search time (9). Slutsky and Mirny also assume D1D and D3D are equal and fast and that τ1D = τ3D, also resulting in a rapid search time (10).

Since slow global searches are in part due to fairly strong non-specific binding, this naturally leads to the question of why strong non-specific binding would exist. We suggest two possibilities. (i) Strong non-specific binding is functionally important. For example, this binding can be important for relief of repression when a repressor's affinity for its specific site is reduced by ligand binding (23,26). In this case, strong non-specific binding will allow the non-specific sites to out-compete the specific site. For a treatment of other equilibrium aspects of gene regulation, see (27,28). (ii) There is a design limitation. If it is generally true that DNA binding domains use the same set of amino acids to bind both specific and non-specific sites (20), albeit in different ways, there may be a limitation on how weak the non-specific binding can be compared to a strong specific binding.


In this article, we examine the distance-dependence of TF search time and find that (i) the search time is distance-dependent, with local searches likely at distances less than the sliding length of a TF; (ii) hops lengthen the reach of local searches by increasing the effective sliding length by a factor of ∼3; and (iii) due to a TF's strong non-specific affinity for DNA and slow diffusion, global search can be slow. Therefore, low copy-number TFs will find their sites markedly faster if they maintain a small initial distance to their binding sites.

The role of DNA conformation

In our model, we attempt to describe the TF search process more realistically by including hops. However, we still assume that jumps are completely randomizing. This assumption is a bit simplistic, though probably sensible, given the data at hand. In reality, the compact conformation of DNA can make jumps non-uniform, e.g. making it more likely for a protein to associate to DNA a certain distance away from a dissociation point (but much farther than a hop). For example, the proposed solenoid structure of bacterial DNA can make jumps to the next coil more likely than to a remote coil. Such correlated jumps may make the search more redundant, thus (i) making the global search slower and (ii) making the local search spread farther than a single effective sliding length se. Some have addressed this effect (11,29), and though progress is being made (30), experimental data on the in vivo conformation of prokaryotic DNA are still scarce, so it is still unclear what role DNA conformation plays in the search process in live cells. We also note that our work neglects the presence of other DNA-binding proteins that may also interfere with the search process (31).

Biological implications

The arrival of a TF at its regulatory site is an essential step in the process of gene regulation. While this step may not necessarily be the rate-limiting one, significant delays in the arrival time can make gene regulation sluggish, thus slowing down response to environmental stimuli and causing the organism to be less fit. We note that these arguments apply to both repressors and activators. A slow search by an activator can lead to delayed gene activation, while a slow search by a repressor can lead to unrepressed activity of certain genes or leaky repression. To avoid the adverse effects of slow regulation, we propose that prokaryotes may take advantage of fast local search through a mechanism described below.

Since transcription and translation are coupled in bacteria, proteins are produced in situ, near their gene's physical location on the chromosome. We suggest that if a TF gene and its binding site are within se bp of each other, this co-localization enables a local search and presumably faster gene regulation. This provides a kinetic advantage that is arguably less costly than maintaining a larger copy number of a TF to compensate for slow search. We believe that strong support for our hypothesis can be found in the organization of prokaryotic genomes. A number of groups have observed that prokaryotic TF genes tend to be closer to their binding sites than expected at random (16,17). An explanation offered is the selfish gene cluster hypothesis: the proximity is favorable for horizontal gene transfer of an operon together with its regulator (32,33). Our model offers a kinetic explanation, which is a modified version of Droge and Muller-Hill's idea of ‘local concentration’ (22). In another study, we use bioinformatics to show that TFs with a small number of targets in the genome are likely to be co-localized with their target sites, on length scales comparable to our estimates of se (18). We also demonstrate that the observed co-localization and gene orientation cannot be explained by the selfish gene cluster hypothesis, further supporting our kinetic hypothesis. For highly pleiotropic TFs with a larger number of target sites, co-localization is impossible, and we suggest rapid search is achieved by high copy number. For example, ArcA, a highly pleiotropic TF with over 50 binding sites in the E. coli genome (34), is estimated to have a copy number of 200 per cell (35).

In eukaryotes, where transcription and translation happen in different compartments of the cell, co-localization of this type is clearly not possible. However, eukaryotes have highly organized nuclei, and the compartmentalization may lead to a high concentration of a TF in the vicinity of its binding site (22). Additionally, it appears that some TFs are constitutively bound to their binding sites and await an activation signal [e.g. Gal4 (36)].

Our simulations also demonstrate that a local search has smaller variance of the arrival time. Noise in gene expression is shown to be in part determined by initiation or repression of transcription. Variability in the arrival of a TF to a promoter can greatly increase temporal noise and cell-to-cell variance of gene expression (37–40). Thus cells may employ a local search not only to reduce delays in gene regulation, but also to control (though not necessarily reduce) noise in gene expression.

To estimate the effects of search time on noise, we note that Cai et al. have shown that, under the control of a repressor like LacI, protein production occurs in bursts, presumably due to the competition between the repressor and RNA polymerase (41). The frequency of the bursts is proportional to the search time (42). Therefore, the baseline production of a protein that is repressed by a single repressor will scale directly with search time.

Comparison with a recent in vivo experiment

A recent in vivo single-molecule experiment shows that the 1D/3D search strategy is likely at work in living cells (25). The experiments studied the search by Lac repressor for its cognate sites. Lac repressor was in its native orientation, i.e. co-localized with the target site, and thus produced at a distance of about 300 bp from the site. The measured search time for a single protein per cell was approximately 6 min, which is somewhat faster than our estimated global search time if we were to assume $K_d^{NS}$ is $10^{-3}$ M, a value at the very upper limit of the measured in vivo range (23). Since the protein synthesis was co-localized with its site, and the YFP marker used had a short maturation time of 7 min, it is hard to delineate contributions of the local and global search. A more direct test would be to measure and compare the search time for a system where the TF gene is distant from its target site.

Testing the proposed model with experiments

We propose a number of ways to test the distance-dependence of the search time. In each case, we propose to compare two strains of E. coli, one in which the gene of the TF of interest is less than se bp away from its binding site, e.g. within a few hundred bp, and one in which the gene is much farther away, e.g. over 10 kbp away. In the first strain, a TF will be synthesized near its binding site, making a local search likely, and in the second strain, the lack of co-localization will make a local search unlikely. Since not all of the necessary parameters are known with great accuracy, it is hard to predict the exact differences in search times and the downstream effects between the two strains; however, given the large estimated differences between local and global search times, we would expect the properties measured in the proposed experiments to be detectably different.
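To make the expected contrast between the two strains concrete, the following is a toy sketch and not the model used in this work. It assumes a TF that performs a symmetric 1D random walk in 1-bp steps and is lost from the DNA with a fixed probability `q` before each step, and it ignores hops and re-association entirely. Under these assumptions, the probability of reaching a site `d` bp away before dissociating is exactly `r**d`, where `r` solves the recurrence for the walk. The names `reach_probability` and `simulate` and the value of `q` are hypothetical.

```python
import math
import random


def reach_probability(distance_bp, q):
    """Exact probability that the toy walker reaches the site.

    Solves P(x) = (1 - q)/2 * (P(x - 1) + P(x + 1)) with P(0) = 1 and
    P(x) -> 0 far from the site, which gives P(x) = r**x.
    """
    s = 1.0 - q
    r = (1.0 - math.sqrt(1.0 - s * s)) / s
    return r ** distance_bp


def simulate(distance_bp, q, trials, seed=0):
    """Monte Carlo estimate of the same probability, as a cross-check."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        x = distance_bp
        while x > 0:
            if rng.random() < q:      # walker dissociates: local search fails
                break
            x += 1 if rng.random() < 0.5 else -1
        else:                         # loop ended because x reached 0
            hits += 1
    return hits / trials


# Hypothetical loss probability per step, chosen only for illustration.
q_assumed = 2e-6
for d in (100, 300, 10_000):
    print(f"d = {d:>6} bp: P(reach) = {reach_probability(d, q_assumed):.3g}")

# Quick consistency check of the formula on a small, fast case.
print(simulate(5, 0.01, 20_000, seed=1), reach_probability(5, 0.01))
```

For small `q` the probability decays roughly as exp(-d * sqrt(2q)), so in this toy picture a gene a few hundred bp from the site and one more than 10 kbp away fall on opposite sides of the characteristic length 1/sqrt(2q).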

First, in vivo single-molecule measurements (25) can be used to directly measure the binding time in the two strains. Second, one can measure the consequences of co-localization on gene expression by comparing the degree of repression (43), the noise in gene expression (37,38,44) or the dynamics of individual bursts of expression (45) in strains where the TF of interest represses a reporter gene. Finally, one can compare the more subtle effects of the timing of repression, which are not directly observable but have an impact on fitness. This can be done using competitive growth experiments. One can compete the two strains, both producing a repressor that controls the production of a deleterious protein, but differing in the relative locations of the repressor gene and its target gene. If our model is correct, the strain with the locally produced repressor will have less leaky repression and therefore a growth advantage over the other strain.


While this paper was in press, we were pointed to Skoko et al. (47), which suggests that some proteins stay bound to DNA much longer than expected, given their $K_d^{NS}$. In fact, our picture of hops is consistent with this experimental observation, i.e. multiple re-associations will allow proteins to remain bound to DNA for a long time, particularly when competitor DNA is scarce or absent. This mechanism suggests that some experimental methods for measuring $K_d^{NS}$ may allow proteins to rapidly re-associate with the same piece of DNA and some may not, leading to different measurements of the apparent $K_d^{NS}$ for the same protein.


Supplementary Data are available at NAR Online.


Z.W. is a recipient of a Howard Hughes Medical Institute Predoctoral Fellowship. L.M. is funded by the NEC Research Fund and by the National Center for Biomedical Computing i2b2. We thank Nickolay Khazanov, Jason Leith, and John Tsang for helpful discussions and the anonymous reviewers for constructive comments on the manuscript. Funding to pay the Open Access publication charges for this article was provided by the NEC Research Fund and the National Center for Biomedical Computing, i2b2.

Conflict of interest statement. None declared.


1. Riggs AD, Bourgeois S, Cohn M. The lac repressor-operator interaction. 3. Kinetic studies. J. Mol. Biol. 1970;53:401–417.
2. Richter PH, Eigen M. Diffusion controlled reaction rates in spheroidal geometry. Application to repressor-operator association and membrane bound enzymes. Biophys. Chem. 1974;2:255–263.
3. Berg OG, Winter RB, von Hippel PH. Diffusion-driven mechanisms of protein translocation on nucleic acids. 1. Models and theory. Biochemistry. 1981;20:6929–6948.
4. Winter RB, Berg OG, von Hippel PH. Diffusion-driven mechanisms of protein translocation on nucleic acids. 3. The Escherichia coli lac repressor-operator interaction: kinetic measurements and conclusions. Biochemistry. 1981;20:6961–6977.
5. Shimamoto N. One-dimensional diffusion of proteins along DNA. Its biological and chemical significance revealed by single-molecule measurements. J. Biol. Chem. 1999;274:15293–15296.
6. Blainey PC, van Oijen AM, Banerjee A, Verdine GL, Xie XS. A base-excision DNA-repair protein finds intrahelical lesion bases by fast sliding in contact with DNA. Proc. Natl Acad. Sci. USA. 2006;103:5752–5757.
7. Wang YM, Austin RH, Cox EC. Single molecule measurements of repressor protein 1D diffusion on DNA. Phys. Rev. Lett. 2006;97:048302.
8. Coppey M, Benichou O, Voituriez R, Moreau M. Kinetics of target site localization of a protein on DNA: a stochastic approach. Biophys. J. 2004;87:1640–1649.
9. Halford SE, Marko JF. How do site-specific DNA-binding proteins find their targets? Nucleic Acids Res. 2004;32:3040–3052.
10. Slutsky M, Mirny LA. Kinetics of protein-DNA interaction: facilitated target location in sequence-dependent potential. Biophys. J. 2004;87:4021–4035.
11. Hu T, Grosberg A, Shklovskii B. How proteins search for their specific sites on DNA: the role of DNA conformation. Biophys. J. 2006;90:2731–2744.
12. Hu T, Shklovskii BI. How does a protein search for the specific site on DNA: the role of disorder. Phys. Rev. E. 2006;74:021903.
13. Redner S. A Guide to First-Passage Processes. Cambridge, New York: Cambridge University Press; 2001.
14. Gowers DM, Halford SE. Protein motion from non-specific to specific DNA by three-dimensional routes aided by supercoiling. EMBO J. 2003;22:1410–1418.
15. Gowers DM, Wilson GG, Halford SE. Measurement of the contributions of 1D and 3D pathways to the translocation of a protein along DNA. Proc. Natl Acad. Sci. USA. 2005;102:15883–15888.
16. Warren PB, ten Wolde PR. Statistical analysis of the spatial distribution of operons in the transcriptional regulation network of Escherichia coli. J. Mol. Biol. 2004;342:1379–1390.
17. Hershberg R, Yeger-Lotem E, Margalit H. Chromosomal organization is shaped by the transcription regulatory network. Trends Genet. 2005;21:138–142.
18. Kolesov G, Wunderlich Z, Laikova ON, Gelfand MS, Mirny LA. How gene order is influenced by the biophysics of transcription regulation. Proc. Natl Acad. Sci. USA. 2007;104:13948–13953.
19. Winter RB, von Hippel PH. Diffusion-driven mechanisms of protein translocation on nucleic acids. 2. The Escherichia coli repressor-operator interaction: equilibrium measurements. Biochemistry. 1981;20:6948–6960.
20. Kalodimos CG, Biris N, Bonvin AM, Levandoski MM, Guennuegues M, Boelens R, Kaptein R. Structure and flexibility adaptation in nonspecific and specific protein-DNA complexes. Science. 2004;305:386–389.
21. Mossing MC, Record MT Jr. Thermodynamic origins of specificity in the lac repressor-operator interaction. Adaptability in the recognition of mutant operator sites. J. Mol. Biol. 1985;186:295–305.
22. Droge P, Muller-Hill B. High local protein concentrations at promoters: strategies in prokaryotic and eukaryotic cells. Bioessays. 2001;23:179–183.
23. Revzin A. The Biology of Nonspecific DNA Protein Interactions. London: CRC Press; 1990.
24. Elowitz MB, Surette MG, Wolf PE, Stock JB, Leibler S. Protein mobility in the cytoplasm of Escherichia coli. J. Bacteriol. 1999;181:197–203.
25. Elf J, Li GW, Xie XS. Probing transcription factor dynamics at the single-molecule level in a living cell. Science. 2007;316:1191–1194.
26. von Hippel PH, Revzin A, Gross CA, Wang AC. Non-specific DNA binding of genome regulating proteins as a biological control mechanism: I. The lac operon: equilibrium aspects. Proc. Natl Acad. Sci. USA. 1974;71:4808–4812.
27. Bintu L, Buchler NE, Garcia HG, Gerland U, Hwa T, Kondev J, Kuhlman T, Phillips R. Transcriptional regulation by the numbers: applications. Curr. Opin. Genet. Dev. 2005;15:125–135.
28. Bintu L, Buchler NE, Garcia HG, Gerland U, Hwa T, Kondev J, Phillips R. Transcriptional regulation by the numbers: models. Curr. Opin. Genet. Dev. 2005;15:116–124.
29. Lomholt MA, Ambjornsson T, Metzler R. Optimal target search on a fast-folding polymer chain with volume exchange. Phys. Rev. Lett. 2005;95:260603.
30. Gitai Z, Thanbichler M, Shapiro L. The choreographed dynamics of bacterial chromosomes. Trends Microbiol. 2005;13:221–228.
31. Flyvbjerg H, Keatch SA, Dryden DT. Strong physical constraints on sequence-specific target location by proteins on DNA molecules. Nucleic Acids Res. 2006;34:2550–2557.
32. Lawrence JG. Gene organization: selection, selfishness, and serendipity. Annu. Rev. Microbiol. 2003;57:419–440.
33. Lawrence JG, Roth JR. Selfish operons: horizontal transfer may drive the evolution of gene clusters. Genetics. 1996;143:1843–1860.
34. Keseler IM, Collado-Vides J, Gama-Castro S, Ingraham J, Paley S, Paulsen IT, Peralta-Gil M, Karp PD. EcoCyc: a comprehensive database resource for Escherichia coli. Nucleic Acids Res. 2005;33:D334–D337.
35. Link AJ, Robison K, Church GM. Comparing the predicted and observed properties of proteins encoded in the genome of Escherichia coli K-12. Electrophoresis. 1997;18:1259–1313.
36. Selleck SB, Majors JE. In vivo DNA-binding properties of a yeast transcription activator protein. Mol. Cell. Biol. 1987;7:3260–3267.
37. Ozbudak EM, Thattai M, Kurtser I, Grossman AD, van Oudenaarden A. Regulation of noise in the expression of a single gene. Nature Genet. 2002;31:69–73.
38. Elowitz MB, Levine AJ, Siggia ED, Swain PS. Stochastic gene expression in a single cell. Science. 2002;297:1183–1186.
39. Raser JM, O'Shea EK. Control of stochasticity in eukaryotic gene expression. Science. 2004;304:1811–1814.
40. van Zon JS, Morelli MJ, Tanase-Nicola S, ten Wolde PR. Diffusion of transcription factors can drastically enhance the noise in gene expression. Biophys. J. 2006;91:4350–4367.
41. Cai L, Friedman N, Xie XS. Stochastic protein expression in individual cells at the single molecule level. Nature. 2006;440:358–362.
42. Friedman N, Cai L, Xie XS. Linking stochastic dynamics to population distribution: an analytical framework of gene expression. Phys. Rev. Lett. 2006;97:168302.
43. Oehler S, Amouyal M, Kolkhof P, von Wilcken-Bergmann B, Muller-Hill B. Quality and position of the three lac operators of E. coli define efficiency of repression. EMBO J. 1994;13:3348–3355.
44. Maheshri N, O'Shea EK. Living with noisy genes: how cells function reliably with inherent variability in gene expression. Annu. Rev. Biophys. Biomol. Struct. 2007;36:413–434.
45. Golding I, Paulsson J, Zawilski SM, Cox EC. Real-time kinetics of gene activity in individual bacteria. Cell. 2005;123:1025–1036.
46. Blattner FR, Plunkett G 3rd, Bloch CA, Perna NT, Burland V, Riley M, Collado-Vides J, Glasner JD, Rode CK, Mayhew GF, et al. The complete genome sequence of Escherichia coli K-12. Science. 1997;277:1453–1474.
47. Skoko D, Wong B, Johnson RC, Marko JF. Micromechanical analysis of the binding of DNA-bending proteins HMGB1, NHP6A, and HU reveals their ability to form highly stable DNA-protein complexes. Biochemistry. 2004;43:13867–13874.

Articles from Nucleic Acids Research are provided here courtesy of Oxford University Press