# Timing and Dynamics of Single Cell Gene Expression in the Arabinose Utilization System

Georg Fritz,^{*†} Ulrich Gerland,^{†} Kirsten Jung,^{‡} and Joachim O. Rädler^{*}

^{*}Department für Physik und CeNS, Ludwig-Maximilians-Universität, Munich, Germany;

^{†}Institut für Theoretische Physik, Universität zu Köln, Cologne, Germany; and

^{‡}Department Biologie I, Bereich Mikrobiologie, Ludwig-Maximilians-Universität, Munich, Germany

## Abstract

The arabinose utilization system of *Escherichia coli* displays a stochastic all-or-nothing response at intermediate levels of arabinose, where the population divides into a fraction catabolizing the sugar at a high rate (on-state) and a fraction not utilizing arabinose (off-state). Here we study this decision process in individual cells, focusing on the dynamics of the transition from the off- to the on-state. Using quantitative time-lapse microscopy, we determine the time delay between inducer addition and fluorescence onset of a GFP reporter. Through independent characterization of the GFP maturation process, we can separate the lag time caused by the reporter from the intrinsic activation time of the arabinose system. The resulting distribution of intrinsic time delays scales inversely with the external arabinose concentration, and is compatible with a simple stochastic model for arabinose uptake. Our findings support the idea that the heterogeneous timing of gene induction is causally related to a broad distribution of uptake proteins at the time of sugar addition.

## INTRODUCTION

Bacteria have sophisticated signal transduction and gene regulatory networks for rapid adaptation to environmental changes. In recent years, it has become increasingly recognized that the dynamical response of these biochemical reaction networks is subject to significant stochastic fluctuations (1), which can lead to heterogeneous behavior across cellular populations. Examples include the transient differentiation of *Bacillus subtilis* in its late exponential phase (2,3), bacterial persistence in *Escherichia coli* (4), and the mating pheromone response pathway in yeast (5). In many of these systems, positive feedback plays a fundamental role, since it gives rise to bistability and thereby causes two clearly distinct gene expression states (6). It has been demonstrated that biochemical noise induces stochastic transitions between the two stable states, and it was suggested that the resulting population heterogeneity provides selective advantages for colony growth in fluctuating environments (7,8).

A prototypic class of positive feedback systems is that of the inducible sugar utilization systems, in which bistability is caused by the autocatalytic positive feedback of the sugar on its own uptake proteins. These systems allow bacteria to grow on less favorable carbon sources than glucose: For instance, in a medium where lactose is the only energy source, *E. coli*'s lactose utilization (*lac*) system either imports and catabolizes lactose at a high rate (on-state), or it does not use lactose at all (off-state) (9). This bistable behavior has drastic effects at the population level. When a large amount of external lactose was added to a previously uninduced culture, all cells in the population switched from the off- to the on-state. However, at lower sugar concentrations only a fraction of cells switched to the on-state while others remained in the off-state (9,10).

Here, we are interested in the dynamics of such a switching process on the single cell level. We study these dynamics in the context of the arabinose utilization (*ara*) system of *E. coli* (11), another well-characterized bistable system (see Fig. 1). In this case, arabinose is imported by the high-affinity low-capacity transporter AraFGH and the low-affinity high-capacity transporter AraE. If internal arabinose exceeds a threshold concentration, it activates AraC, which in turn promotes expression of *araFGH*, *araE*, and the genes for arabinose catabolism, *araBAD*. Siegele and Hu (12) analyzed population distributions at intermediate sugar levels, and revealed that the *ara* system displays an all-or-nothing expression pattern similar to the *lac* system. They conjectured that in uninduced cells the stochastic background expression of the *ara* regulon leads to a wide distribution of *ara* uptake proteins. Addition of arabinose would then lead to different rates of arabinose accumulation, causing heterogeneous timing of gene induction within the population. At a given time there would be a fraction of induced and a fraction of uninduced cells, and the depletion of arabinose by the metabolism of the induced cells could explain the fixation of the all-or-nothing response. This conjecture is consistent with a computational study of autocatalytic expression systems (13) and experiments which placed *araE* under the control of a constitutive promoter, finding homogeneous gene expression in the population (14–16). The dynamics of switching processes has also been studied using flow cytometry techniques, which yield a time series of population distributions of gene expression levels (17,18).

In this study, we take a different experimental approach: Rather than recording population distributions, we use quantitative time-lapse fluorescence microscopy to follow the expression dynamics of the switching process in many cells, individually. In a physics analogy, this is akin to following the trajectories of many particles, instead of recording their spatial density distribution at different time points. Clearly, the distributions can be obtained from the trajectories, but not vice versa; i.e., the trajectories contain more information. In this case, this additional information is particularly useful to disentangle different variables that affect the response of individual cells: The observed time-dependent fluorescence level is the final output of a series of biochemical processes, which can be grouped into two connected subsystems—an uptake module and a reporter module. Both modules experience noise, which, to a first approximation, can be subsumed into a single parameter for each module. As we will see, one can extract these two parameters for each cell by fitting an appropriate model to the fluorescence trajectory of the cell. As a result, we can directly obtain the separate distributions for these two parameters, and even measure their correlations. Note that this analysis would not have been possible based on population distributions of gene expression levels.

Using this approach, we address the question raised by Siegele and Hu (12), i.e., is the all-or-nothing response of the *ara* system associated with heterogeneous timing of gene induction, and, if so, is the heterogeneous timing causally related to a wide distribution of *ara* uptake proteins? At subsaturating sugar levels, we observe a significant delay between addition of inducer and increase of fluorescence, which is indeed broadly distributed. To clarify the origin of this delay and its broad distribution, it is necessary to separate the intrinsic lag of the GFP expression dynamics from the time-lag inherent to the stochastic arabinose uptake. To this end, we leverage our microfluidic setup to separately measure the distribution of GFP maturation times across an *E. coli* population. We also record the cell-to-cell variation of the growth rates. Using a simple quantitative model for the expression dynamics, we then extract the intrinsic timing statistics for gene induction. We find that this distribution is well described by an analytical delay time distribution derived from a stochastic model for the uptake module. Our results support the conclusion that the heterogeneous timing is indeed due to a wide distribution of *ara* uptake proteins across the population.

## MATERIALS AND METHODS

### Bacterial strain and plasmid

*E. coli* strain LMG194 (*F*^{−} Δ*lacX74 galE galK thi rpsL* Δ*phoA* (*Pvu*II) Δ*ara714 leu*::*Tn10*) (19) was transformed with plasmid pBAD24-GFP (this work) using a standard method as described elsewhere (20). The gene *gfpmut3* (21) encoding the green fluorescent protein GFPmut3 was amplified by PCR with primers GFP-*Kpn*I sense (5′-TACCATGGTACCAAGTAAAGGAGAAGAACTTTTC-3′) and GFP-*Hin*dIII antisense (5′-CATAGTAAGCTTTTATTTGTATAGTTCATCCATGCC-3′) using plasmid pJBA29 (22) as a template. The DNA fragment was cut with restriction endonucleases *Kpn*I and *Hin*dIII, and was then ligated into the similarly treated vector pBAD24 (19), resulting in plasmid pBAD24-GFP. The correct insertion of the fragment was verified by restriction analysis as well as by DNA sequence analysis.

### Growth conditions

Cells were grown in LB medium (23) or M63 minimal medium (19) containing 0.2% (w/v) glycerol as C-source. When indicated, 0.01%, 0.02%, 0.05%, or 0.2% (w/v) arabinose was added to induce GFP expression. Bacteria were inoculated from single colonies grown on LB agar plates and grown overnight (37°C, shaking at 300 rpm) in M63 medium. Overnight cultures were diluted 1:50 into fresh M63 medium and cultured for 2 h. Bacteria were subsequently diluted in prewarmed medium to an appropriate density and were then applied to one channel of a poly-L-lysine-coated microfluidic chamber (*μ*-Slide VI; Ibidi, Martinsried, Germany). The slide was then incubated at 37°C for several minutes. By softly flushing the channel with prewarmed medium supplemented with the desired arabinose concentration, gene expression was induced and the sample was rinsed at the same time. After the preparation procedure, the vast majority of the bacteria adhered with their long axis parallel to the surface.

### Time-lapse microscopy

Time-lapse experiments were performed on a fully automated inverted microscope (Axiovert 200M, Zeiss, Oberkochen, Germany) equipped with a motorized stage (Prior Scientific, Cambridge, UK). All devices were controlled by Andor IQ software (Andor, Belfast, Northern Ireland). Fluorescence illumination was provided by an X-cite120 light source (EXFO, Quebec, Canada). An appropriate filter set (excitation: 470/40; beamsplitter 495; emission: 525/50; filter set Nr 38; Zeiss) was used. Bright field and fluorescence images of several fields in one sample were acquired every 5 min with a highly sensitive EMCCD camera (iXon DV885; Andor) through an oil-immersion 100× plan-neofluar objective with NA 1.3 (Zeiss), with acquisition times of 0.1 s to 0.2 s. To further prevent photobleaching and photodamage all light sources were shuttered between exposures and an orange filter was used in the bright-field light path. The temperature in the sample environment was maintained at 37°C using a custom-built heating box. Focalcheck fluorescence microspheres (Invitrogen, Karlsruhe, Germany) were used to correct for output variations of the lamp.

### Data analysis

ImageJ (24) and Igor Pro 4.0 (WaveMetrics, Lake Oswego, OR) were used for data analysis. Cell outlines were created by thresholding the bright-field images. Total fluorescence was measured as the sum over all pixel values within the outline in the corresponding background-corrected fluorescence image. Time traces were assembled by tracking the cells manually. As photobleaching was found to be negligible for the given experimental system, fluorescence traces were fitted without further processing.

### Measurement of the GFP maturation time distribution in vivo

The maturation time in single cells was determined using an approach similar to the one established in Gordon et al. (25): Translation was blocked by the addition of 200 *μ*g/ml chloramphenicol, 30 min after the induction of *gfp*-expression with 0.2% arabinose. Fluorescence images were acquired every 3–5 min before and after inhibition. As this measurement was more sensitive, the illumination was reduced and the EM gain of the camera was used. Photobleaching could thus again be neglected. Cellular fluorescence was determined by summing all pixel values above the background level for each bacterium. This method is qualitatively equal to the use of cell outlines as described above, but can only be applied if the range of fluorescence values is limited and bacteria do not grow strongly. The resulting maturation time courses were fitted by an exponential function.

## RESULTS

### Single cell induction kinetics

To study the induction kinetics of the *ara* system, we use an *E. coli* strain where both *araBAD* and *araC* are deleted (19). It is transformed with the reporter plasmid pBAD24-GFP, containing the *araC* gene and the rapidly maturing GFP variant *gfpmut3* (21) (which is under the control of the *P*_{BAD} promoter; see Materials and Methods). The *araC* gene is supplied on the plasmid to guarantee full functionality of the DNA loop required for repression of *P*_{BAD} in the absence of arabinose (11) and to provide the proper stoichiometry of transcription factors and *P*_{BAD} promoters. The chromosomal deletion of *araBAD* avoids the negative feedback of the internal arabinose catabolism. This feedback complicates the system, but is irrelevant for our questions, which focus on the kinetics of the induction when arabinose first becomes available externally. The gene regulatory circuit of our system is illustrated in Fig. 1.

To perform the time-lapse fluorescence microscopy, we introduce the bacteria into a microfluidic chamber, where they attach to the poly-L-lysine coated chamber wall. The microfluidic chamber provides homogeneous external conditions for the bacteria and can be used to rapidly exchange the medium. At *t* = 0 min, we induce the bacteria with 0.2% (13.3 mM), 0.05% (3.33 mM), 0.02% (1.33 mM), or 0.01% (0.66 mM) arabinose, and then record the time-evolution of GFP fluorescence in single cells. Representative fluorescence trajectories for the highest (0.2%) and the lowest (0.01%) arabinose concentration are shown in Fig. 2, *a* and *b*, respectively.

*Figure 2 (caption fragment):* induction at *t* = 0 min with 0.2% arabinose (*a*) and 0.01% arabinose (*b*) (*open circles*). The traces were analyzed up to the first cell division, which results ...

For all arabinose concentrations, the individual time-traces of each cell appear rather smooth and deterministic, whereas there is a significant variation in the response from cell to cell. We also observe a time lag between the addition of arabinose and the onset of fluorescence. With decreasing arabinose concentration, the typical lag time becomes longer, and its cell-to-cell variation becomes more pronounced. Below, we will devise a rigorous way to quantify this delay. Here, we only apply a simple thresholding procedure to extract an apparent lag time. Using an intensity threshold of 2.5 × 10^{4} fluorescence units, we determine an apparent lag time of 16 ± 2.5 min at 0.2% arabinose and a more substantial delay of 34 ± 10 min at 0.01% arabinose. In the latter case, ~10% of the bacteria do not show any fluorescence within our time window of 70 min.
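The simple thresholding step described above can be sketched in Python as follows. This is a minimal illustration and not the analysis code used in the study: the function name `apparent_lag_time`, the linear interpolation between frames, and the example trace are all assumptions introduced here; only the threshold of 2.5 × 10^{4} fluorescence units and the 5-min frame interval come from the text.

```python
def apparent_lag_time(times, fluorescence, threshold=2.5e4):
    """Return the time at which a trace first reaches `threshold`.

    The crossing time is linearly interpolated between the two frames
    that bracket the threshold (an assumption of this sketch). Returns
    None if the trace never reaches the threshold in the time window.
    """
    for i, value in enumerate(fluorescence):
        if value >= threshold:
            if i == 0:
                return times[0]
            t0, t1 = times[i - 1], times[i]
            f0, f1 = fluorescence[i - 1], value
            return t0 + (threshold - f0) * (t1 - t0) / (f1 - f0)
    return None


# Hypothetical trace sampled every 5 min (illustrative values only).
times = [0, 5, 10, 15, 20, 25]
trace = [1.0e3, 2.0e3, 5.0e3, 2.0e4, 3.0e4, 6.0e4]
lag = apparent_lag_time(times, trace)  # crossing lies between 15 and 20 min
```

Cells that never cross the threshold within the observation window (about 10% of the bacteria at 0.01% arabinose) would return `None` here and have to be counted separately.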

With the sudden increase of the external arabinose concentration at *t* = 0 min, a cascade of biochemical processes is triggered, culminating in the fluorescent output signal measured in our experiment. To narrow down the origin of the stochasticity in the apparent lag time, we need to analyze the individual steps in this cascade. For this analysis, it is useful to separate the system into two distinct modules, an uptake module and a GFP expression module, as depicted in Fig. 3 *a*. The uptake module not only comprises arabinose import (represented here by an effective uptake protein, Upt, that subsumes transport by AraE and AraFGH) but also includes the positive feedback of arabinose on the uptake protein. The expression module turns on the production of the output signal, when internal arabinose reaches a threshold level (26). The delay time *τ*_{D} that is required to reach this threshold is solely determined by the uptake module. However, GFP fluorescence does not follow promoter activation instantaneously. Instead, the processes of transcription, translation, and GFP maturation depicted in Fig. 3 *b* also generate a dynamical delay and thereby contribute to the apparent delay estimated above. To quantitatively estimate the intrinsic delay *τ*_{D} and its statistics, we now scrutinize the expression module in detail.

### Quantitative characterization of the expression module

#### GFP maturation time

A significant portion of the dynamic delay of the expression module is incurred by GFP maturation, the process whereby the folded protein becomes fluorescent. The rate-limiting reaction is an oxidation with a time constant of several minutes up to several hours (27), depending on the variant of the protein and possibly on the organism. However, for our present purpose, we not only need the average time constant, but also need to know whether there is a large cell-to-cell variation associated with the maturation process. With our microfluidic setup, we can directly probe this cell-to-cell variation experimentally, under the same conditions as in the induction experiments. First, we induce bacteria with 0.2% arabinose and then inhibit protein synthesis in situ by flushing the channel with the antibiotic chloramphenicol. The resulting fluorescence trajectories cease to increase ~15 min after the addition of the antibiotic (see Fig. 4 *a* for a few representative trajectories). Following the rationale established in Gordon et al. (25), this behavior reflects the maturation dynamics of the remaining, nonfluorescent GFPs. The distribution of time-constants *τ*_{m} of GFP maturation shown in Fig. 4 *b* was obtained from exponential fits to 77 single-cell time series (*solid lines* in Fig. 4 *a*). We find an average maturation time of *τ*_{m} = 6.5 min and a standard deviation of 0.6 min, i.e., a cell-to-cell variation of only ~10%.
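To illustrate how a maturation time constant can be extracted from a single post-inhibition trace, the sketch below assumes that fluorescence relaxes to a plateau as F(t) = F_∞ − ΔF·exp(−t/*τ*_{m}) after translation is blocked. It uses a log-linear regression rather than a nonlinear exponential fit, which is a simplification introduced here. The plateau estimate from the last few frames, the cut-off `min_fraction`, and the synthetic trace are assumptions of this example.

```python
import math


def fit_maturation_time(times, fluorescence, n_plateau=3, min_fraction=0.05):
    """Estimate tau_m from a trace recorded after translation is blocked.

    times: minutes since the inhibitor was added.
    The plateau is estimated from the last `n_plateau` frames, and only
    points whose distance to the plateau exceeds `min_fraction` of the
    initial distance enter the regression of log(plateau - F) on time.
    """
    plateau = sum(fluorescence[-n_plateau:]) / n_plateau
    initial_gap = plateau - fluorescence[0]
    xs, ys = [], []
    for t, f in zip(times, fluorescence):
        gap = plateau - f
        if gap > min_fraction * initial_gap:
            xs.append(t)
            ys.append(math.log(gap))
    # Ordinary least-squares slope of log(gap) versus time.
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return -1.0 / slope


# Synthetic, noise-free trace with tau_m = 6.5 min, one frame every 3 min.
times = list(range(0, 61, 3))
trace = [1000.0 - 400.0 * math.exp(-t / 6.5) for t in times]
tau_m_estimate = fit_maturation_time(times, trace)
```

On real data the regression would be noisier near the plateau, which is why the late points are excluded from the fit.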

*Figure 4 (caption fragment):* (*a*) GFP expression was induced with 0.2% arabinose at *t* = 0 min and protein synthesis was inhibited by addition of 200 *μ*g/ml chloramphenicol at *t* = 30 min, as indicated by the arrow. ...

Our finding of a relatively small cell-to-cell variation suggests that the maturation process is largely independent of the internal state of the cell in *E. coli*. This appears plausible, given that the oxidation reaction does not depend on intracellular components (27). For comparison, measurements of the maturation times of YFP and CFP in yeast (25) found considerably longer maturation times of ~40 min, but only a slightly larger relative cell-to-cell variation (15–20%). Moreover, from in vitro measurements of various YFP variants, oxidation timescales as low as 2–8 min were determined (28), indicating that the rapid maturation time detected in our experiment is conceivable in vivo.

#### Gene copy number

Since our GFP reporter is encoded on a plasmid, the average copy-number of the plasmid and its cell-to-cell variation are important properties of the expression module. The plasmid pBAD24 has an average copy number comparable to pUC (29), which is present in ~55 copies per cell (30). Assuming plasmid production and dilution with constant rates, we expect Poissonian fluctuations of ~√55 ≈ 7 plasmids (13%). In similar plasmids, ColE1 and R1, negative feedback is known to reduce the copy-number variations below the Poisson limit (31). This may also apply to pBAD24, which would make the variation even less significant. We expect that the plasmid copy number grows in proportion to the volume of the cell, such that the concentration of plasmids remains constant. Hence, we will assume that the rate *γ* of gene replication in Fig. 3 *b* equals the rate of volume expansion of the cells.

#### Cell growth

As the above discussion of the gene copy number shows, the distribution of growth rates is another characteristic affecting the quantitative properties of the expression module. We analyzed the growth of individual cells in the microfluidic channel by recording the time-evolution of their area detected under the microscope. Since the rod-shaped *E. coli* cells grow mainly along their principal axis (see image panels in Fig. 2), the growth rate of the cell area is a proxy for the growth rate by cell volume. From exponential fits (32) to 84 time series of the cell area we found a distribution of time constants for cell growth with an average of 50 min and a standard deviation of 6 min. Hence, the cell-to-cell variations of the growth rate are also relatively small. This result indicates that the microchemical conditions in our channel are sufficiently constant to guarantee a reproducible growth state of the cells. We also found that the doubling time was independent of the arabinose concentration, consistent with the fact that in this strain arabinose cannot be catabolized and used as an energy source.

#### mRNA half-life and protein expression rate

Finally, the dynamics of the expression module is dependent on the rate constants for *gfp* expression and mRNA degradation. Average mRNA half-lives were determined for most of *E. coli*'s genes (33) and are typically in the range 3–8 min. The work of Smolke et al. (34) indicates that the population-averaged half-life of *gfp* mRNA is in the same range; for our analysis below, we will assume an average half-life of 6 min. In contrast, there is currently no report on the cell-to-cell variation of *gfp* mRNA half-lives. We expect that such a variation would mainly be produced by cell-to-cell variations of RNase abundance and other components required for transcript turnover. These components have been shown to vary with the growth rate (35). Since the growth rate varies only by ~10% from cell to cell in our experiment (see above), we estimate the relative cell-to-cell variations of mRNA half-life to be similar. This may be an overestimate, since the degradation machinery negatively autoregulates its own expression (36), a mechanism known to reduce gene expression noise (37).

The protein expression rate has been quantified experimentally at the single-cell level for the *P*_{R} promoter of phage *λ*, and substantial cell-to-cell variations of ~35% were determined (38). These large relative differences likely stem from cell-to-cell variations in global cellular components such as RNA polymerases or ribosomes. We expect similar variations for GFP expression from the *P*_{BAD} promoter.

### Distribution of GFP expression rate and intrinsic delay time

Given the above characterization of the expression module, we can now construct a simple quantitative model for its dynamic response, and then use this model to extract the intrinsic delay *τ*_{D}. The smooth shape of the time series in Fig. 2 suggests that the dynamics of individual cells follows a rather deterministic course, while the differences between the cells stem from cell-to-cell variation of the reaction rates. Therefore, we use a deterministic rate equation model to describe the expression dynamics within a single cell, but allow for cell-to-cell variation in the model parameters. This model follows the reaction scheme depicted in Fig. 3 *b*: Transcription of *gfp* mRNA from the promoter *P*_{BAD} is turned on at *t* = *τ*_{D} and then remains constant at rate *α*_{x}. However, the number of plasmids (and hence gene copies) increases with rate *γ*, which equals the cell-doubling rate, so that the plasmid copy number *P* remains stable in the bacterial population. We denote the mRNA degradation rate by *λ*_{x}, and the translation and maturation rates of GFP by *α*_{y} and 1/*τ*_{m}, respectively (see Appendix A for details).

Within this model, the time-evolution of the total number of fluorescent GFP molecules in a cell, *Z*(*τ*), is described by the expression

where *τ* = *t* – *τ*_{D} is the time after transcription is switched on, *α*_{p} ≡ *Pα*_{x}*α*_{y}/(*γ* + *λ*_{x}) is a lumped constant giving the protein synthesis rate in fluorescence units per minute [FU/min], and *Z*_{0} is a constant determined by the initial conditions. Here, the first two terms in parentheses describe transients associated with the equilibration of the GFP maturation process and the mRNA degradation reaction, respectively, i.e., their contributions decay exponentially with time constants *τ*_{m} and 1/*λ*_{x}. In the long-time limit, the last, exponentially increasing term is dominant. It reflects the constant protein production from an exponentially growing number of plasmids, and describes the long-time behavior of the total fluorescence per cell. However, since we study the dynamics of gene expression during the first cell cycle after induction, all terms, including the transients, are relevant.
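For orientation, an expression with the structure described here can be derived from the reaction scheme under a few assumptions that are not spelled out in this section and may differ in detail from Appendix A. The assumptions are that the gene copy number grows as *P*e^{*γτ*}, that mRNA (*X*) and immature GFP (*Y*) are absent at *τ* = 0, and that GFP is neither degraded nor counted per volume. The rate equations then read

$$\frac{dX}{d\tau} = \alpha_x P e^{\gamma\tau} - \lambda_x X, \qquad \frac{dY}{d\tau} = \alpha_y X - \frac{Y}{\tau_m}, \qquad \frac{dZ}{d\tau} = \frac{Y}{\tau_m},$$

and integrating them step by step (for $\lambda_x \tau_m \neq 1$) gives

$$Z(\tau) = \alpha_p \left( -\frac{(\gamma+\lambda_x)\,\tau_m^{2}}{(1+\gamma\tau_m)(1-\lambda_x\tau_m)}\, e^{-\tau/\tau_m} + \frac{1}{\lambda_x (1-\lambda_x\tau_m)}\, e^{-\lambda_x \tau} + \frac{1}{\gamma (1+\gamma\tau_m)}\, e^{\gamma\tau} \right) + Z_0 ,$$

with $\alpha_p \equiv P\alpha_x\alpha_y/(\gamma+\lambda_x)$ and $Z_0$ collecting all time-independent contributions. The three terms in parentheses decay with time constants $\tau_m$ and $1/\lambda_x$ and grow with rate $\gamma$, respectively, matching the description above. The prefactors follow from the assumed initial conditions and should be read as a sketch rather than as the published form of Eq. 1.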

From the previous section, we conclude that the parameter *α*_{p}, comprising the plasmid copy number and the protein expression rate, captures most of the cell-to-cell variation within the expression module. To fit the model in Eq. 1 to the single-cell induction kinetics, we therefore fixed the remaining parameters to their population-averaged values. Hence, in the optimization procedure of the fit, we only allow the adjustment of *α*_{p} and the uptake-induced delay *τ*_{D}, which we sought to extract. Note that this choice fixes all relevant timescales governing the dynamics in Eq. 1 and the free parameters only impose shifts in the onset (*τ*_{D}) and in the absolute magnitude (*α*_{p}) of *gfp*-expression.
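A two-parameter fit of this kind can be sketched in Python as follows. The sketch assumes a three-stage linear cascade (mRNA, immature GFP, mature GFP) driven by an exponentially growing gene copy number, with no mRNA or GFP present before switch-on, so that *Z* = 0 at *τ* = 0. The closed form coded in `shape` follows from those assumptions and is not taken from Appendix A. The growth rate is taken as the inverse of the 50-min time constant, the delay is found by a grid search instead of a general nonlinear optimizer, and the synthetic trace is illustrative only.

```python
import math

TAU_M = 6.5                   # GFP maturation time constant [min] (from the text)
LAMBDA_X = math.log(2) / 6.0  # mRNA degradation rate [1/min], 6-min half-life
GAMMA = 1.0 / 50.0            # growth rate [1/min], assuming 50 min is an e-folding time


def shape(tau):
    """Z(tau) / alpha_p for the assumed cascade, with Z(0) = 0."""
    if tau <= 0:
        return 0.0
    k = 1.0 / TAU_M
    a = -(GAMMA + LAMBDA_X) / ((GAMMA + k) * (k - LAMBDA_X))
    b = k / (LAMBDA_X * (k - LAMBDA_X))
    c = k / (GAMMA * (GAMMA + k))
    return (a * math.exp(-k * tau) + b * math.exp(-LAMBDA_X * tau)
            + c * math.exp(GAMMA * tau) - (a + b + c))


def fit_trace(times, fluorescence, delays):
    """Grid search over tau_D; alpha_p follows from linear least squares."""
    best = None
    for tau_d in delays:
        s = [shape(t - tau_d) for t in times]
        norm = sum(v * v for v in s)
        if norm == 0.0:
            continue  # trace lies entirely before switch-on for this delay
        alpha = sum(v * f for v, f in zip(s, fluorescence)) / norm
        sse = sum((f - alpha * v) ** 2 for v, f in zip(s, fluorescence))
        if best is None or sse < best[0]:
            best = (sse, alpha, tau_d)
    return best[1], best[2]


# Synthetic, noise-free trace: alpha_p = 200 FU/min, tau_D = 12 min.
times = list(range(0, 75, 5))
trace = [200.0 * shape(t - 12.0) for t in times]
alpha_fit, tau_fit = fit_trace(times, trace, [0.5 * i for i in range(81)])
```

Because the model is linear in *α*_{p} once *τ*_{D} is fixed, only the delay needs to be searched. A general least-squares routine would serve equally well.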

We fitted the time series of cells induced with various levels of arabinose (0.2%, 0.05%, 0.02%, and 0.01%). A few representative fitted curves for the highest and lowest concentration are plotted in Fig. 2 as solid lines. The resulting histograms for the delay time are shown in Fig. 5 *a*. For the lowest arabinose level (0.01%, *upper panel*) we find that the delay times are distributed between 5 and 50 min with a mean and standard deviation of *τ*_{D} = 23 min and min, respectively. With increasing arabinose concentration, both the mean and the standard deviation of the delay time distribution decrease gradually, until at the highest arabinose level (0.2%, *lower panel*), a distribution with *τ*_{D} = 4.1 min and min is reached.

To test whether there is a relationship between the delay time and the protein synthesis rate, we calculated their cross-correlation coefficients for all inducing arabinose levels (see Fig. 6 *a*). Only for 0.02% arabinose was a slight anticorrelation detected, whereas for all other concentrations the correlation coefficient is close to zero (*p*-values for finding the observed correlation coefficients by chance in an uncorrelated sample: 0.68 for 0.01% ara, 0.03 for 0.02% ara, 0.73 for 0.05% ara, and 0.72 for 0.2% ara). We also find that the distribution of *gfp*-expression rates itself does not vary systematically with the inducing arabinose concentration, and all distributions fall on top of each other when rescaled by their mean values, see Fig. 6 *b* (pairwise Kolmogorov-Smirnov tests yield significance levels between 0.57 and 0.97 for the null hypothesis that the data sets are drawn from the same underlying distribution). In summary, the low correlations between *τ*_{D} and *α*_{p} on the one hand, and the independence of *α*_{p} of the inducing arabinose level on the other hand, suggest that the uptake and the expression module are indeed functionally separate. Note that our experimental approach with time-lapse fluorescence microscopy was crucial for these results, which would have been impossible to obtain with flow cytometry.
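A correlation test of this type can be illustrated with the following Python sketch. It computes a Pearson correlation coefficient and estimates the chance probability with a permutation test. The permutation test is a choice made for this example, since the text does not state how the *p*-values were obtained, and the per-cell values below are hypothetical.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation coefficient (both samples must have nonzero variance)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def permutation_p_value(x, y, n_perm=2000, seed=0):
    """Fraction of shuffled pairings whose |r| is at least the observed |r|."""
    rng = random.Random(seed)
    r_obs = abs(pearson_r(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson_r(x, shuffled)) >= r_obs - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)  # add-one correction avoids p = 0


# Hypothetical per-cell fit results: delay [min] and synthesis rate [FU/min].
delays = [4.0, 6.5, 3.2, 8.1, 5.0, 7.3, 2.9, 6.0]
rates = [210.0, 180.0, 250.0, 190.0, 230.0, 170.0, 220.0, 240.0]
r = pearson_r(delays, rates)
p = permutation_p_value(delays, rates)
```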

### Stochastic model for the uptake module

Next, we want to assess whether the extracted delay time distributions of Fig. 5 *b* may be causally linked to a broad variation in the number of uptake proteins. We approach this question with the help of a simple stochastic model for the uptake-module depicted in Fig. 3 *a*. The model is useful in three respects:

- It illustrates the mechanism whereby stochastic expression of the uptake protein genes can produce a broad distribution of delay times. We will see that according to this mechanism, the delay time distributions for different inducer concentrations should be related by simple linear rescaling of the time axis. Thus, we will test for this signature of the mechanism in our experimental data.
- Since most model parameters are strongly constrained by literature values, we can test whether an interpretation of our data based on the stochastic model is consistent with these constraints.
- Independent of the precise choice of parameter values, which affect the average delay time and its standard deviation, the model predicts a certain shape for the delay time distribution. We will test whether this shape is compatible with our data.

There are two distinct transport systems for arabinose uptake, AraE and AraFGH. However, the two systems are coupled, and it was found that arabinose uptake can effectively be described as a single Michaelis-Menten process (39). In the sketch of Fig. 3 *a*, this combined transport system is represented by a single gene *upt*. In addition to the transport, the uptake module of Fig. 3 *a* comprises the activation of AraC by internal arabinose, the subsequent stimulation of transcription by the activated complex, and the translation into functional uptake protein. Within our stochastic model for the uptake module, we describe and simulate all of these processes in standard ways (see Appendix B for details).

Fig. 7, *b* and *c*, show the simulated time-evolution of the level of uptake proteins and the level of internal arabinose upon induction with 0.01% external arabinose for a few representative simulation runs. These trajectories illustrate the mechanism leading to a broad distribution of delay times within our model: Internal arabinose initially accumulates approximately linearly in time, and the accumulation accelerates only after reaching the effective arabinose threshold of *a*_{0} ≈ 50 *μ*M for activation of the *araBAD* and *upt* promoters, which is indicated by the solid horizontal line in Fig. 7 *c*. The time delay, *τ*_{D}, caused by the uptake module is the time required for the internal arabinose concentration to reach this threshold level. The rate of arabinose import, given by the slope in Fig. 7 *c*, is proportional to the number of uptake proteins *n* in Fig. 7 *b*. If arabinose import is fast compared to the timescale of changes in the protein abundance, the delay time is given by the simple relation *τ*_{D} = *a*_{0}/(*v*_{0}*n*), where the arabinose uptake rate per uptake protein, *v*_{0}, depends on the external arabinose concentration. Thus, the distribution of uptake proteins in Fig. 7 *a* directly determines the distribution of import rates, which in turn are inversely proportional to the delay times, resulting in the distribution of delay times shown in Fig. 7 *d*.
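The relation *τ*_{D} = *a*_{0}/(*v*_{0}*n*) is easy to explore numerically. In the minimal sketch below, the threshold *a*_{0} = 50 *μ*M is taken from the text, whereas the value of *v*_{0} and the protein numbers are hypothetical and chosen only to show how a spread in *n* translates into a spread in delay times.

```python
import math

A0 = 50.0  # effective arabinose threshold [µM], from the text


def delay_time(n, v0, a0=A0):
    """Delay [min] to reach the threshold with n uptake proteins.

    v0 is the uptake rate per protein [µM/min]; a cell without any
    uptake protein never reaches the threshold in this approximation.
    """
    if n <= 0:
        return math.inf
    return a0 / (v0 * n)


v0 = 0.5  # hypothetical uptake rate per protein [µM/min]
for n in (0, 1, 5, 20):
    print(n, delay_time(n, v0))
```

Doubling *v*_{0} halves every delay, whatever the value of *n*. This is the scaling property tested on the experimental distributions.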

*Figure 7 (caption fragment):* ... panel *b* and internal arabinose in panel *c* illustrate that the rate of arabinose uptake ...

A simple prediction of this mechanism is that an increase of the uptake velocity *v*_{0} will reduce all delay times within a distribution of cells by the same factor. In other words, the delay time distributions for different arabinose levels (and hence different *v*_{0}) should fall on top of each other upon simple linear rescaling of the time axis (and restoring normalization). In Fig. 5 *b*, we test this prediction on our experimental time delay distributions. We find that after rescaling to the same mean value, the cumulative distributions are congruent with each other. This agreement is also quantitatively supported by pairwise Kolmogorov-Smirnov tests, which test whether the samples are likely to be drawn from the same underlying distribution (the legend to Fig. 5 shows the respective significance levels). Note that the linear scaling of the time axis with 1/*v*_{0} does not imply linear scaling with the arabinose level, since *v*_{0} depends nonlinearly on the external arabinose level (see also further below).
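The rescaling test can be sketched as follows: each delay sample is divided by its mean, and the two rescaled samples are compared through the two-sample Kolmogorov-Smirnov statistic, i.e., the largest distance between their empirical cumulative distributions. The helper names and the protein numbers are hypothetical, and the two *v*_{0} values are arbitrary. The sketch returns only the statistic, not the significance level.

```python
def rescale_to_unit_mean(samples):
    """Divide every delay time by the sample mean."""
    mean = sum(samples) / len(samples)
    return [s / mean for s in samples]


def ks_statistic(a, b):
    """Two-sample KS statistic: maximal distance between empirical CDFs."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


# Same hypothetical protein numbers, two hypothetical uptake rates v0.
proteins = [2, 5, 9, 14, 20, 33]
delays_low = [50.0 / (1.0 * n) for n in proteins]   # low arabinose
delays_high = [50.0 / (4.0 * n) for n in proteins]  # high arabinose

d_raw = ks_statistic(delays_low, delays_high)
d_rescaled = ks_statistic(rescale_to_unit_mean(delays_low),
                          rescale_to_unit_mean(delays_high))
```

Before rescaling the two samples differ markedly, and after rescaling they coincide. For real data, a library routine such as `scipy.stats.ks_2samp` would additionally return the significance level.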

To relate the experimentally observed shape of the distribution to the prediction of the stochastic model, we will now derive an analytical expression for the delay time distribution. To this end, we first consider only intrinsic noise and study the effect of extrinsic noise below. Before the addition of the inducer arabinose, expression of the uptake proteins is a completely random, unregulated process. Following the work of Berg (40) and under the assumptions stated in Appendix B, we find a steady-state distribution *P*(*n*) for the number of uptake proteins *n* of the form

$$P(n) = \frac{\Gamma(\mu + n)}{\Gamma(n + 1)\,\Gamma(\mu)} \left(\frac{b}{1 + b}\right)^{n} \left(\frac{1}{1 + b}\right)^{\mu}, \tag{2}$$

which is sometimes referred to as a negative binomial. Here, the ratio *b* = *ν*_{p}/*λ*_{m} of the translation rate and the mRNA degradation rate corresponds to the typical number of proteins produced from a single mRNA and is also known as the burst size (41). The ratio *μ* = *ν*_{m}^{0}/*λ*_{p} of the basal transcription rate and the protein dilution rate can be interpreted as a dimensionless burst frequency (the number of bursts within the lifetime of a protein). Both parameters determine the mean ⟨*n*⟩ = *μb* and the variance *δn*^{2} = ⟨*n*⟩(1 + *b*) of *P*(*n*). Fig. 7 *a* shows the steady-state distribution *P*(*n*) obtained from our stochastic simulations of the uptake module (*shaded histogram*), together with the analytical expression in Eq. 2 for the same rate constants. The excellent agreement suggests that the assumptions leading to Eq. 2 are all satisfied in the relevant parameter regime.
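The stated mean and variance can be verified numerically. The sketch below assumes the standard negative binomial form, *P*(*n*) = Γ(*μ* + *n*)/[Γ(*n* + 1)Γ(*μ*)] · [*b*/(1 + *b*)]^{*n*} (1 + *b*)^{−*μ*}, and uses illustrative values of *μ* and *b* that are not the fitted parameters; the function name and the truncation point are choices made for this example.

```python
import math

def uptake_protein_pmf(n, mu, b):
    """Negative binomial P(n) with burst frequency mu and burst size b."""
    log_p = (math.lgamma(n + mu) - math.lgamma(n + 1) - math.lgamma(mu)
             + n * math.log(b / (1.0 + b)) - mu * math.log(1.0 + b))
    return math.exp(log_p)

mu, b = 3.0, 30.0          # illustrative values only
support = range(0, 5000)   # truncation chosen so that the neglected tail is negligible
probs = [uptake_protein_pmf(n, mu, b) for n in support]

total = sum(probs)
mean = sum(n * p for n, p in zip(support, probs))
var = sum((n - mean) ** 2 * p for n, p in zip(support, probs))

print(f"normalization: {total:.6f}")
print(f"mean: {mean:.3f} (mu*b = {mu * b:.3f})")
print(f"variance: {var:.3f} (mu*b*(1+b) = {mu * b * (1 + b):.3f})")
```

Working with log-gamma functions avoids overflow for large *n* and permits non-integer *μ*.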

Next, we study the effect of extrinsic noise which leads to a variation of reaction parameters from cell to cell. An experimental characterization of extrinsic noise in *E. coli* (38) found a typical parameter variation of ~20%. When we adopt this level of extrinsic noise for all parameters in our stochastic simulations, the resulting protein distribution has a significantly larger standard deviation than the distribution in the absence of extrinsic noise, while the mean remains almost unchanged (see Supplementary Material, Fig. S1). However, the protein distribution in the presence of extrinsic noise is still well fitted by Eq. 2, with an increased effective burst size and a reduced effective burst frequency. Keeping this in mind, the following results can be generalized to the realistic scenario where extrinsic fluctuations are present.

To obtain an approximation for the delay time distribution, we assume that arabinose uptake is rapid compared to the typical timescale of changes in the protein abundance. In this adiabatic limit, the delay time is inversely proportional to the current protein abundance in each cell, i.e., *τ*_{D} = *τ*_{0}/*n*, where *τ*_{0} ≡ *a*_{0}/*v*_{0} is the time for a single uptake protein to accumulate arabinose to the threshold level *a*_{0}. With this relation, the steady-state uptake protein distribution (Eq. 2) leads to a delay time distribution of the form

$$Q(\tau_D) = \frac{\tau_0}{\tau_D^{2}}\, \frac{\Gamma(\mu + \tau_0/\tau_D)}{\Gamma(\tau_0/\tau_D + 1)\,\Gamma(\mu)} \left(\frac{b}{1 + b}\right)^{\tau_0/\tau_D} \left(\frac{1}{1 + b}\right)^{\mu}, \tag{3}$$

where Γ(*x*) is the gamma function. In Fig. 7 *d*, we compare this analytical prediction (*solid line*) to the stochastic simulation (*shaded bars*). The small deviation stems from the fact that the number of uptake proteins is not constant over the period of the time delay. Note that in the opposite limit, where the protein dynamics is much faster than the characteristic time of arabinose uptake (*λ*_{p}^{−1} ≪ *τ*_{D}), every cell simply experiences the average abundance of uptake protein ⟨*n*⟩, and the delay time distribution approaches a sharply peaked function at *τ*_{D} ≈ *τ*_{0}⟨*n*⟩^{−1} (data not shown). In our case, *λ*_{p}^{−1} ≈ 70 min is much larger than the average delay times, so that the assumption of a constant *n* is sufficiently accurate. The mean and variance of the delay time distribution can be approximated by

$$\langle \tau_D \rangle \approx \frac{\tau_0}{\mu b}\left(1 + \frac{1 + b}{\mu b}\right), \qquad \delta\tau_D^{2} \approx \frac{\tau_0^{2}\,(1 + b)}{(\mu b)^{3}} \tag{4}$$

(see Appendix B). From these expressions, it is clear that for large burst sizes, *b* ≫ 1, the model has two key parameters, which together determine the mean and width of the delay time distribution: the time required to reach the internal arabinose threshold by a single protein burst, *τ*_{0}/*b*, and the burst frequency *μ*.
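How well a second-order expansion captures the moments of *τ*_{D} = *τ*_{0}/*n* can be examined numerically. The sketch below assumes the negative binomial form of *P*(*n*), conditions on *n* ≥ 1 (a cell without any transporter would never be induced in the adiabatic picture), and compares the exact sums with the expansion ⟨*τ*_{D}⟩ ≈ (*τ*_{0}/⟨*n*⟩)(1 + *δn*^{2}/⟨*n*⟩^{2}) and *δτ*_{D}^{2} ≈ (*τ*_{0}/⟨*n*⟩)^{2} *δn*^{2}/⟨*n*⟩^{2}. The parameter values are hypothetical and chosen such that the relative width of *P*(*n*) is small.

```python
import math

def nb_pmf(n, mu, b):
    """Negative binomial P(n) with burst frequency mu and burst size b."""
    log_p = (math.lgamma(n + mu) - math.lgamma(n + 1) - math.lgamma(mu)
             + n * math.log(b / (1.0 + b)) - mu * math.log(1.0 + b))
    return math.exp(log_p)

def delay_moments_exact(tau0, mu, b, n_max=20000):
    """Mean and variance of tau_D = tau0 / n, conditioned on n >= 1."""
    weights = [nb_pmf(n, mu, b) for n in range(1, n_max)]
    norm = sum(weights)
    m1 = sum(w * tau0 / n for n, w in enumerate(weights, start=1)) / norm
    m2 = sum(w * (tau0 / n) ** 2 for n, w in enumerate(weights, start=1)) / norm
    return m1, m2 - m1 ** 2

def delay_moments_expansion(tau0, mu, b):
    """Second-order expansion in delta n around the mean <n> = mu * b."""
    mean_n = mu * b
    rel_var = (1.0 + b) / (mu * b)   # delta n^2 / <n>^2
    return tau0 / mean_n * (1.0 + rel_var), (tau0 / mean_n) ** 2 * rel_var

tau0, mu, b = 600.0, 20.0, 30.0      # hypothetical values (tau0 in minutes)
exact_mean, exact_var = delay_moments_exact(tau0, mu, b)
approx_mean, approx_var = delay_moments_expansion(tau0, mu, b)
print(f"mean: exact {exact_mean:.4f}, expansion {approx_mean:.4f}")
print(f"variance: exact {exact_var:.4f}, expansion {approx_var:.4f}")
```

For these values the expansion reproduces the mean closely, whereas the variance is underestimated; the deviation grows as *μ* decreases, because the relative width of *P*(*n*) then becomes comparable to one.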

Now we test whether the shape of the delay time distribution predicted by the model is quantitatively consistent with our experimental distributions. To this end, we fit the model in Eq. 3 to the data in Fig. 5 *a* by varying the two key parameters identified above. The resulting fits (*solid lines*) display good agreement with the experimental data, as indicated by one-sample KS-tests under the null hypothesis that the samples are drawn from the analytical distribution. The significance levels are 0.50, 0.47, 0.77, and 0.07 for 0.01%, 0.02%, 0.05%, and 0.2% arabinose, respectively. Only in the case of 0.2% arabinose does the test point to a significant difference between the theoretical and experimental distribution. However, for this concentration the estimated delay times are very short, such that the error of the estimation itself is likely to account for the deviations. Note that the two-parameter fit guarantees that the mean and standard deviation of the experimental and theoretical distribution will match. However, the fact that the shape of the distributions shows excellent agreement is a nontrivial result, suggesting that the discussed delay mechanism can indeed explain our observations.

Finally, we address the consistency of the parameter values. Fig. 8 shows the estimated parameters as a function of the external arabinose concentration. The timescale *τ*_{0}/*b* of arabinose accumulation in Fig. 8 *a* decreases monotonically as a function of external arabinose and saturates for large sugar abundances, whereas the burst frequency *μ* in Fig. 8 *b* is constant for all arabinose levels. This observation is consistent with the idea that the underlying protein distribution, characterized by *μ* and *b*, is independent of the externally provided sugar concentration, and that the differences in timing can be explained by shifts in the effective arabinose uptake velocity per uptake protein, *v*_{0}: By assuming simple Michaelis-Menten saturation kinetics for *v*_{0}, one expects that *τ*_{0} scales inversely with the external arabinose concentration [*a*_{ex}], i.e.,

$$\tau_0 = \frac{a_0}{v_0} = \frac{a_0}{v_{\mathrm{max}}}\left(1 + \frac{K_m}{[a_{ex}]}\right),$$

where *v*_{max} denotes the maximal uptake velocity per uptake protein and *K*_{m} the Michaelis constant. This behavior is indeed found in Fig. 8 *a* (*inset*), and with the resulting values for *v*_{max} and *K*_{m} and a typical value of *b* = 30 for the burst factor (41), all parameters are compatible with the experimentally constrained ranges discussed in Appendix B.
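The Michaelis-Menten form implies that *τ*_{0}/*b* is linear in 1/[*a*_{ex}], which is the basis of a Lineweaver-Burk fit. The sketch below illustrates the procedure on noise-free synthetic data; the concentrations and the saturation time `t_sat` = *a*_{0}/(*b* *v*_{max}) are hypothetical, and only the value *K*_{m} = 2.8 mM is taken from the text. It is not a reanalysis of the data in Fig. 8.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares fit y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def accumulation_time(a_ex, t_sat, k_m):
    """tau_0 / b = t_sat * (1 + K_m / [a_ex]), with t_sat = a_0 / (b * v_max)."""
    return t_sat * (1.0 + k_m / a_ex)

# Hypothetical external concentrations (mM) and saturation time (min).
concentrations = [0.5, 1.0, 2.5, 5.0, 10.0]
true_t_sat, true_k_m = 0.4, 2.8
times = [accumulation_time(c, true_t_sat, true_k_m) for c in concentrations]

# Lineweaver-Burk: regress the accumulation time on the inverse concentration.
intercept, slope = linear_fit([1.0 / c for c in concentrations], times)
fitted_t_sat = intercept          # a_0 / (b * v_max)
fitted_k_m = slope / intercept    # Michaelis constant, same units as [a_ex]
print(f"t_sat = {fitted_t_sat:.3f} min, K_m = {fitted_k_m:.3f} mM")
```

Converting the intercept into *v*_{max} additionally requires *a*_{0} in molecules per cell and a value for *b*, which is why the text quotes *v*_{max} only together with an assumed burst factor.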

## DISCUSSION

We studied the expression dynamics during induction of the bistable arabinose utilization system in single *E. coli* cells using quantitative time-lapse fluorescence microscopy. Upon addition of arabinose, we observed a characteristic time delay before the cells switched from a state of basal expression to a state of high expression of the *ara* regulon. The typical duration of this delay exhibited a systematic dependence on the externally supplied arabinose concentration: At a saturating arabinose level, we found rapid induction within all cells of the culture, whereas with decreasing levels, we detected a significant broadening and shift of the delay time distribution function. To characterize the cell-to-cell variability in the cellular response, we dissected the system into an uptake module with stochastic behavior, and an expression module which displays virtually deterministic behavior in individual cells. We first studied the expression module, in particular by measuring the cell-to-cell distribution of the GFP maturation time. To the best of our knowledge, this constitutes the first measurement of a maturation time distribution in bacteria. We then developed a hybrid deterministic/stochastic theoretical model to analyze our experimental data. The model is based on the assumption that the initial basal expression of the arabinose transporters determines the rate of arabinose uptake. Adopting the approach of Berg (40), we found an analytic expression for the distribution of transporter proteins and the distribution of delay times. The theory consistently fits the shape of the experimental delay time distributions for various inducer concentrations. Hence our data support a previous conjecture by Siegele and Hu (12), according to which the delay time distribution is causally linked to the distribution of uptake proteins in the absence of the inducer.
To corroborate our model even further, it would be interesting to control the level of transporter proteins independently, e.g., by using an inducible promoter that is independent of arabinose. Also, it remains an open question how the two transport systems are coupled. It appears that the high-affinity low-capacity transporter *araFGH* and the low-affinity high-capacity transporter *araE* are orchestrated to respond like a single protein. A similar analysis to ours using knockout mutants in one of the two transport systems could shed light on this matter.

In general, we determined the dynamic response of bacteria to an external change of food conditions. Since such decisions are of vital importance to living systems, we can speculate about their impact on the fitness of a bacterial population. The observed heterogeneous timing in gene induction may simply be a fortuitous consequence of the evolutionary process that shaped the arabinose utilization system in *E. coli*. Alternatively, it may be beneficial for a bacterial colony if the individual cells respond at different times when arabinose suddenly becomes available in modest amounts. Note that in our experiments with the *araBAD*-deficient strain, even the lowest arabinose level, if maintained over a long time, ultimately induces the *ara* system in almost all cells. However, for a wild-type strain in an environment where arabinose availability may fluctuate, temporal disorder of gene induction could provide selective advantages for the colony as a whole. For instance, it might be beneficial to prevent costly synthesis of the arabinose system in all cells when the sugar level is only moderate and may soon be depleted. Our analysis indicates that the delay time distribution of the system can be readily tuned over evolutionary timescales by adjusting the burst frequency and burst size of the uptake proteins. In the future, it will be interesting to further explore the possible connections between the system design in individual cells and the biological function at the population level.

## SUPPLEMENTARY MATERIAL

To view all of the supplemental files associated with this article, visit www.biophysj.org.

## Acknowledgments

We are grateful to R. Heermann for construction of the plasmid. We thank T. Hwa for helpful discussions and M. Leisner for careful reading of the manuscript.

This work was supported by the LMUinnovativ project “Analysis and Modeling of Complex Systems”. J.A.M. acknowledges funding by the *Elitenetzwerk Bayern*. Author contributions: J.A.M. carried out the experiments. G.F. performed the simulations and analytical calculations. All authors designed the research and wrote the article.

## APPENDIX A: DETERMINISTIC GFP EXPRESSION MODEL

To extract the intrinsic time delay *τ*_{D} from our single cell expression data, we employ a simple deterministic model that follows the scheme depicted in Fig. 3 *b*. We assume that the transcription rate from the promoter *P*_{BAD} is zero until the internal arabinose threshold for activation of *P*_{BAD} is reached at *t* = *τ*_{D}. Then, the promoter activity jumps to its maximal value *α*_{x}. The corresponding rate-equations for the total abundance of plasmids (*P*), *gfp* mRNA (*X*), immature GFP protein (*Y*), and mature GFP protein (*Z*) per cell, are

with the cell-doubling rate *γ*, the transcription rate *α*_{x}, the translation rate *α*_{y}, the mRNA degradation rate *λ*_{x}, and the GFP maturation rate. Note that the model does not include dilution due to cell growth, since we measured the total fluorescence per cell in our experiments. Therefore the number of plasmids (number of gene copies) increases exponentially in time, keeping the number of genes per volume constant. Solving these equations for *Z*(*t*) leads to Eq. 1 in the main text.
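The qualitative behavior of such a model can be explored with a short numerical integration. The sketch below is an assumption-laden illustration rather than the authors' fitting procedure: it assumes the rate equations d*P*/d*t* = *γP*, d*X*/d*t* = *α*_{x}*P* Θ(*t* − *τ*_{D}) − *λ*_{x}*X*, d*Y*/d*t* = *α*_{y}*X* − *k*_{mat}*Y*, and d*Z*/d*t* = *k*_{mat}*Y*, where the symbol `k_mat` for the maturation rate and all numerical values except the 50-min doubling time are hypothetical.

```python
import math

def simulate_gfp(tau_d, t_end=120.0, dt=0.01, gamma=math.log(2) / 50.0,
                 alpha_x=1.0, alpha_y=1.0, lambda_x=math.log(2) / 2.0,
                 k_mat=1.0 / 7.0, p0=1.0):
    """Forward-Euler integration of the assumed rate equations.

    Returns lists of time points (min) and mature GFP per cell (arbitrary units).
    """
    steps = int(round(t_end / dt))
    p, x, y, z = p0, 0.0, 0.0, 0.0
    times, mature = [0.0], [0.0]
    for i in range(steps):
        t = i * dt
        promoter_on = 1.0 if t >= tau_d else 0.0       # step-like promoter activation
        dp = gamma * p                                  # plasmid replication
        dx = alpha_x * p * promoter_on - lambda_x * x   # gfp mRNA
        dy = alpha_y * x - k_mat * y                    # immature GFP
        dz = k_mat * y                                  # mature GFP, no dilution term
        p, x, y, z = p + dt * dp, x + dt * dx, y + dt * dy, z + dt * dz
        times.append((i + 1) * dt)
        mature.append(z)
    return times, mature

times, early = simulate_gfp(tau_d=10.0)   # hypothetical short intrinsic delay
_, late = simulate_gfp(tau_d=30.0)        # hypothetical long intrinsic delay
print(f"mature GFP at t = {times[-1]:.0f} min: {early[-1]:.1f} vs {late[-1]:.1f}")
```

In this sketch the fluorescence remains at zero until *τ*_{D} and then rises with an additional lag set by mRNA turnover and maturation, which is the reporter contribution that has to be separated from the intrinsic delay.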

## APPENDIX B: STOCHASTIC MODEL FOR ARABINOSE UPTAKE

The arabinose uptake module, see Fig. 3 *a*, includes the processes for the uptake of arabinose as well as transcription, translation, and turnover of uptake proteins. In the following we describe the chemical reactions included in the stochastic simulations used to generate Fig. 7 and Fig. S1. We then derive an analytical approximation for the delay time distribution and discuss the experimental constraints on the model parameters.

#### Arabinose uptake

Comparison of arabinose uptake in wild-type strains with *araE* and *araFGH* deletion strains revealed that the two transporters do not operate independently (39). Instead, arabinose transport was best described by a single Michaelis-Menten function. Our model reflects this behavior of the wild-type strain through the use of a single effective uptake protein (referred to as Upt) for arabinose import,

$$\mathrm{Upt} + a_{ex} \overset{K_m}{\rightleftharpoons} \mathrm{Upt}\cdot a_{ex} \xrightarrow{v_{\mathrm{max}}} \mathrm{Upt} + a.$$

The uptake protein binds external arabinose *a*_{ex} with dissociation constant *K*_{m} and, once bound, translocates it to the cytoplasm at rate *v*_{max}. The effective uptake velocity per uptake protein is hence *v*_{0} = *v*_{max}[*a*_{ex}]/(*K*_{m} + [*a*_{ex}]). Cytoplasmic arabinose is denoted by *a*.

#### Transcriptional regulation

The *P*_{BAD} promoter in the *ara*-regulon is one of the best characterized bacterial promoters: In the presence of internal arabinose, AraC stimulates transcription from *P*_{BAD}, while AraC represses transcription by formation of a DNA loop in the absence of arabinose (11). When exceeding an arabinose threshold of *a*_{0} ≈ 50 *μ*M, the promoter activity of *P*_{BAD} increases cubically with the internal arabinose concentration (26). In contrast to the detailed studies on *P*_{BAD}, less is known about the promoter activity function of the promoters *P*_{E} and *P*_{FGH}, which regulate expression of the transport proteins. Both promoters are also induced by internal arabinose, but lack an upstream AraC-binding site (required for DNA looping) and are not repressed in the absence of arabinose. Consequently, their basal expression level is higher than for *P*_{BAD} and the fold-change is reduced from ~400 for *P*_{BAD} to ~150 for *P*_{E} and *P*_{FGH} (42). However, the detailed promoter activity as a function of internal arabinose is not known for these promoters. Apart from the lack of the AraC binding site required for DNA looping, the promoters *P*_{E} and *P*_{FGH} display a high similarity to *P*_{BAD} (43). Therefore we model transcriptional regulation of the uptake proteins by introducing a heuristic promoter *P*_{upt}, which has the same characteristics as *P*_{BAD}, but lacks the repression in the absence of arabinose. To reproduce the cubic increase of the promoter activity function of *P*_{BAD}, we allow three arabinose molecules to bind AraC with dissociation constant *K*_{C}. This activated complex binds the promoter *P*_{upt} with dissociation constant *K*_{P} and thereby switches the transcription rate from its basal rate *ν*_{m}^{0} to its maximal rate *ν*_{m}. The chemical reactions for transcriptional regulation are

$$3a + \mathrm{C} \overset{K_C}{\rightleftharpoons} \mathrm{C}a_3, \qquad \mathrm{C}a_3 + P_{upt} \overset{K_P}{\rightleftharpoons} P_{upt}\cdot\mathrm{C}a_3,$$

$$P_{upt} \xrightarrow{\nu_m^0} P_{upt} + \mathrm{mRNA}, \qquad P_{upt}\cdot\mathrm{C}a_3 \xrightarrow{\nu_m} P_{upt}\cdot\mathrm{C}a_3 + \mathrm{mRNA}.$$

Here the concentration of AraC molecules [C] is a variable that changes little over time (43) and is therefore assumed to be a constant parameter in our model. In steady state, the probability of finding the promoter *P*_{upt} in a transcriptionally activated state is a Hill function of the internal arabinose concentration, *a*^{3}/(*K*_{upt}^{3} + *a*^{3}). We define the effective arabinose threshold for activation of *P*_{upt} as

$$K_{upt} \equiv \left(\frac{K_C K_P}{[\mathrm{C}]}\right)^{1/3}.$$
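The effective threshold follows from two equilibrium steps, which can be made explicit in a few lines. The sketch below assumes that the free AraC concentration equals the constant total [C], that the activated complex has concentration [C]*a*^{3}/*K*_{C}, and that the promoter is occupied with probability [C*a*_{3}]/(*K*_{P} + [C*a*_{3}]); the function names are introduced here for illustration. The parameter values are those quoted under "Parameter values".

```python
def effective_threshold(k_c, k_p, c_total):
    """K_upt = (K_C * K_P / [C])**(1/3), in the units of K_C**(1/3)."""
    return (k_c * k_p / c_total) ** (1.0 / 3.0)

def promoter_active_probability(a, k_upt):
    """Hill function with coefficient 3 for the activated promoter state."""
    return a ** 3 / (k_upt ** 3 + a ** 3)

# K_C in uM^3; K_P and [C] in nM (only their ratio enters).
k_upt = effective_threshold(k_c=1e6, k_p=10.0, c_total=100.0)
print(f"K_upt = {k_upt:.1f} uM")
for a in (10.0, 50.0, 250.0):   # internal arabinose in uM, arbitrary test points
    print(f"a = {a:5.1f} uM -> P(active) = {promoter_active_probability(a, k_upt):.3f}")
```

With these inputs the threshold evaluates to roughly 46 *μ*M, consistent with the rounded value of ≈ 50 *μ*M used in the text, and the cubic dependence makes the promoter response switch-like around that value.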

#### Translation and turnover

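mRNA is translated into functional uptake protein at rate *ν*_{p} and gets degraded at rate *λ*_{m}. In contrast, the uptake proteins and arabinose are only diluted by cell growth at doubling rate *γ*:

$$\mathrm{mRNA} \xrightarrow{\nu_p} \mathrm{mRNA} + \mathrm{Upt}, \qquad \mathrm{mRNA} \xrightarrow{\lambda_m} \varnothing, \qquad \mathrm{Upt} \xrightarrow{\gamma} \varnothing, \qquad a \xrightarrow{\gamma} \varnothing.$$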

#### Delay time distribution

Following Berg (40), we derive an analytical approximation for the delay time distribution of our stochastic model. In the absence of arabinose, transcription of the gene for the uptake protein takes place at its basal rate *ν*_{m}^{0}. Neglecting operator state fluctuations (44), the probability to observe *m* transcription events up to time *t* follows a Poisson distribution

$$P(m,t) = \frac{(\nu_m^0 t)^{m}}{m!}\, e^{-\nu_m^0 t},$$

with mean and variance ⟨*m*⟩ = *δm*^{2} = *ν*_{m}^{0}*t*. In the limit of short mRNA lifetime compared to the protein lifetime, one can assume instantaneous, geometrically distributed protein bursts from each mRNA molecule. This implies that the probability that *m* mRNA molecules produce *n* proteins follows a negative binomial distribution

$$P(n|m) = \frac{\Gamma(m + n)}{\Gamma(n + 1)\,\Gamma(m)} \left(\frac{b}{1 + b}\right)^{n} \left(\frac{1}{1 + b}\right)^{m},$$

where the burst size *b* ≡ *ν*_{p}/*λ*_{m} is the average number of proteins produced from one mRNA molecule. Hence, the probability to produce *n* proteins up to time *t* is the weighted sum of negative binomials, *P*(*n*,*t*) = Σ_{*m*} *P*(*n*|*m*) *P*(*m*,*t*). Setting *t* equal to the protein lifetime, *t* = *λ*_{p}^{−1}, yields the steady-state distribution of proteins, and for large *μ* we can replace the Poisson distribution by a *δ*-function located at *m* = *μ*, leading to Eq. 2 in the main text. Applying the transformation rule *Q*(*τ*_{D}) = *P*(*n*) |d*n*/d*τ*_{D}| with *n* = *τ*_{0}/*τ*_{D} yields the delay time distribution in Eq. 3, and the moments ⟨*τ*_{D}⟩ and *δτ*_{D}^{2} are determined by the integrals

$$\langle \tau_D \rangle = \int \frac{\tau_0}{n}\, P(n)\, dn, \qquad \delta\tau_D^{2} = \int \left(\frac{\tau_0}{n} - \langle \tau_D \rangle\right)^{2} P(n)\, dn.$$

Here, expansion of the integrands up to second order in *δn* = *n* − ⟨*n*⟩ brings us to the expressions in Eq. 4.

#### Parameter values

The effective arabinose threshold *K*_{upt} ≈ 50 *μ*M and the promoter binding constant *K*_{P} = 10 nM are chosen similar to the parameters of *P*_{BAD} (26,45). This choice determines the ratio *K*_{C}/[C] (see above), and by choosing a typical value of [C] = 100 nM we obtain *K*_{C} = 10^{6} *μ*M^{3}. For the maximal promoter activity we set a typical value for the promoters in the *ara*-regulon, *ν*_{m} = 5 mRNA/min, which was derived from the mRNA steady-state levels reported in Johnson and Schleif (43). With a promoter fold-change of 150, similar to *P*_{E} and *P*_{FGH} (42), the basal transcription rate is expected to be *ν*_{m}^{0} = *ν*_{m}/150 ≈ 0.03 mRNA/min. From our fits of Eq. 3 to the experimental delay time distributions we obtained the average burst frequency *μ* (Fig. 8 *b*). With our protein dilution rate of *λ*_{p} = *γ* = ln(2)/(50 min) (from our measurement of the growth rate, see main text), this yields a basal expression rate *ν*_{m}^{0} = *μλ*_{p} in good agreement with the biochemical constraints stated before. The mRNA degradation rate *λ*_{m} is set according to a half-life of 2 min (43), allowing us to adjust the translation rate *ν*_{p} to match a typical burst factor of *b* = 30 (41). In wild-type cells, the *K*_{m} for arabinose uptake is ~50 *μ*M (39), and the maximal uptake rate per uptake protein, *v*_{max}, can be estimated from bulk measurements in which the uptake rate per total cellular dry mass was determined (39). By assuming a dry mass of 3 × 10^{−13} g per cell (46) and ~10^{3}−10^{4} uptake proteins per cell (47), we arrive at *v*_{max} = 200–2000 arabinose molecules/protein/min. From a Lineweaver-Burk fit to the data in Fig. 8 *a* we obtained *v*_{max} ≈ 120 molecules/protein/min and *K*_{m} = 2.8 mM. While the value for *v*_{max} is compatible with the biochemical constraints, our *K*_{m} differs by two orders of magnitude from the previously reported value of 50 *μ*M (39).
For such a small Michaelis constant, all arabinose concentrations used in our experiments would saturate the uptake system completely and hence there should be no difference in timing of gene induction. However, the experimental conditions of Daruwalla et al. (39) differ from ours; in particular, the proton gradient between periplasm and cytoplasm, which drives the arabinose/*H*^{+} symport by AraE, is limited by oxygen availability (48). For the case of the lactose/*H*^{+} symporter LacY, it has been shown that a reduced proton gradient leads to an increase of the apparent *K*_{m} (49). Hence, oxygen limitation in our microfluidic setup could explain the observed discrepancy.
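The saturation argument can be made quantitative with a short calculation. The sketch below assumes that the arabinose percentages are weight per volume (g per 100 mL), which is not stated explicitly in this section, and uses the molar mass of arabinose (150.13 g/mol) to convert them to millimolar; the function names are illustrative. It compares the fractional saturation *v*_{0}/*v*_{max} for the previously reported *K*_{m} of 50 *μ*M and for the fitted *K*_{m} of 2.8 mM.

```python
ARABINOSE_MOLAR_MASS = 150.13  # g/mol (C5H10O5)

def percent_to_millimolar(percent_w_v):
    """Convert % (w/v, g per 100 mL) to mM; the w/v reading is an assumption."""
    grams_per_liter = percent_w_v * 10.0
    return grams_per_liter / ARABINOSE_MOLAR_MASS * 1000.0

def saturation(a_ex_mM, k_m_mM):
    """Michaelis-Menten saturation v_0 / v_max."""
    return a_ex_mM / (k_m_mM + a_ex_mM)

levels = [0.01, 0.02, 0.05, 0.2]       # arabinose levels used in the experiments (%)
for k_m in (0.05, 2.8):                # reported and fitted K_m in mM
    row = ", ".join(f"{p}%: {saturation(percent_to_millimolar(p), k_m):.2f}"
                    for p in levels)
    print(f"K_m = {k_m} mM -> v0/vmax at {row}")

low, high = percent_to_millimolar(0.01), percent_to_millimolar(0.2)
ratio_reported = saturation(high, 0.05) / saturation(low, 0.05)
ratio_fitted = saturation(high, 2.8) / saturation(low, 2.8)
print(f"fold change in v0 between 0.01% and 0.2%: "
      f"{ratio_reported:.2f} (K_m = 50 uM) vs {ratio_fitted:.2f} (K_m = 2.8 mM)")
```

Under the w/v assumption, a *K*_{m} of 50 *μ*M leaves less than a 10% change in uptake velocity across the whole concentration range, whereas a *K*_{m} of 2.8 mM yields a several-fold change, in line with the observed shift of the delay times.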

#### Stochastic simulations

Although the reaction schemes above show only the equilibrium constants, the dynamical simulations take all association and dissociation processes into account explicitly. As a conservative assumption, all association rates were chosen 10-fold smaller than the diffusion-limited on-rate of 2 nM^{−1} min^{−1} for a typical transcription factor in *E. coli* (50), and the dissociation rates were adjusted according to the respective equilibrium constants. The trajectories in Fig. 7, *b* and *c*, correspond to single kinetic Monte Carlo simulations (51) for 0.01% external arabinose. The protein and delay-time distributions in Fig. 7, *a* and *d* (*shaded histograms*), were obtained from 5 × 10^{4} independent simulation runs with the same parameters.
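A minimal kinetic Monte Carlo (Gillespie) simulation conveys the idea for the uninduced state. The sketch below is not the simulation used for Fig. 7: it contains only basal transcription, mRNA degradation, translation, and protein dilution, and omits arabinose uptake and promoter regulation. The basal rate of 5/150 mRNA/min is an assumed value based on the fold-change argument in "Parameter values"; the half-lives and the burst size *b* = 30 are taken from the text.

```python
import math
import random

def time_averaged_protein_number(nu_m0, lambda_m, nu_p, lambda_p, t_end, rng):
    """Gillespie simulation of basal expression; returns the time average of n."""
    t, m, n = 0.0, 0, 0          # time (min), mRNA copies, uptake proteins
    weighted_sum = 0.0
    while True:
        # Propensities: transcription, mRNA decay, translation, protein dilution.
        rates = (nu_m0, lambda_m * m, nu_p * m, lambda_p * n)
        total = sum(rates)
        wait = rng.expovariate(total)
        if t + wait >= t_end:
            weighted_sum += n * (t_end - t)
            break
        weighted_sum += n * wait
        t += wait
        r = rng.random() * total
        if r < rates[0]:
            m += 1
        elif r < rates[0] + rates[1]:
            m -= 1
        elif r < rates[0] + rates[1] + rates[2]:
            n += 1
        elif n > 0:
            n -= 1
    return weighted_sum / t_end

lambda_m = math.log(2) / 2.0      # mRNA half-life of 2 min
lambda_p = math.log(2) / 50.0     # dilution with a 50-min doubling time
burst_size = 30.0
nu_p = burst_size * lambda_m      # translation rate that gives b = 30
nu_m0 = 5.0 / 150.0               # assumed basal transcription rate (mRNA/min)

simulated_mean = time_averaged_protein_number(
    nu_m0, lambda_m, nu_p, lambda_p, t_end=1.0e5, rng=random.Random(1))
expected_mean = (nu_m0 / lambda_p) * burst_size   # mu * b
print(f"simulated <n> = {simulated_mean:.1f}, mu * b = {expected_mean:.1f}")
```

A single long trajectory is used here to estimate the steady-state mean; sampling the full distribution of *n*, as in Fig. 7 *a*, would instead require recording *n* at the end of many independent runs.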

## Notes

Judith A. Megerle and Georg Fritz contributed equally to this work.

Editor: Herbert Levine.

## References

*Escherichia coli* by constitutive expression of the low-affinity high-capacity AraE transporter. Part 12. Microbiology. 147:3241–3247.

*Escherichia coli* araBAD promoter by use of a lactose transporter of relaxed specificity. Proc. Natl. Acad. Sci. USA. 99:7373–7377.

*in vitro* mutagenesis system. *In* Promega Technical Manual. Promega Corporation, Fitchburg, WI.

*Escherichia coli* expression vectors having pBR322 copy control. Plasmid. 55:152–157.

*Escherichia coli* at single-gene resolution using two-color fluorescent DNA microarrays. Proc. Natl. Acad. Sci. USA. 99:9697–9702.

*Escherichia coli*: unusual sensitivity of the RNA transcript to RNase E activity. Genes Dev. 9:84–96.

*Escherichia coli* and *Klebsiella pneumoniae* cells. J. Bacteriol. 146:377–384.

*Escherichia coli*: catabolite repression, autoregulation, and effect on araBAD expression. Proc. Natl. Acad. Sci. USA. 81:4120–4124.

**The Biophysical Society**


Timing and Dynamics of Single Cell Gene Expression in the Arabinose Utilization System. Biophysical Journal. Aug 15, 2008; 95(4):2103.
