Plasticity of the cis-Regulatory Input Function of a Gene
Copyright: © 2006 Mayo et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
1 Departments of Molecular Cell Biology and Physics of Complex Systems, The Weizmann Institute of Science, Rehovot, Israel
Arthur D. Lander, Academic Editor, University of California Irvine, United States of America
Corresponding author. Uri Alon: urialon/at/weizmann.ac.il
Received January 24, 2005; Accepted December 8, 2005.
Abstract
The transcription rate of a gene is often controlled by several regulators that bind specific sites in the gene's cis-regulatory region. The combined effect of these regulators is described by a cis-regulatory input function. What determines the form of an input function, and how variable is it with respect to mutations? To address this, we employ the well-characterized lac operon of Escherichia coli, which has an elaborate input function, intermediate between Boolean AND-gate and OR-gate logic. We mapped in detail the input function of 12 variants of the lac promoter, each with different point mutations in the regulator binding sites, by means of accurate expression measurements from living cells. We find that even a few mutations can significantly change the input function, resulting in functions that resemble pure AND gates, OR gates, or single-input switches. Other types of gates were not found. The variant input functions can be described in a unified manner by a mathematical model. The model also lets us predict which functions cannot be reached by point mutations. The input function that we studied thus appears to be plastic, in the sense that many of the mutations do not ruin the regulation completely but rather result in new ways to integrate the inputs.
Introduction
Much of the computation performed by transcription networks occurs in the DNA cis-regulatory region (CRR) of each gene. Most genes are regulated by multiple regulators (inputs) that bind their CRR. The way that these inputs combine to determine the rate of transcription is described by the cis-regulatory input function (CRIF) of the gene. Well-studied examples include input functions that govern developmental genes [1–4] at specific locations and times, when certain combinations of regulators are active. The CRIFs are often described using Boolean functions such as AND- and OR-logic gates [4–15], although graded [8,16–18] input functions are also known to occur.
Recently, a high-resolution map of the CRIF of a well-characterized gene system, the lac operon [19–21] of Escherichia coli, was obtained, using accurate gene-expression measurements from living cells [8]. The lac CRIF has two inputs, corresponding to the two regulators of the system, cAMP receptor protein (CRP) and LacI. The CRIF was found to be a rather intricate function, intermediate between Boolean AND-gate and OR-gate logic [8] (see Figure 1).
Here, we ask which changes in a CRIF can be caused by a few point mutations in the regulatory region and which changes cannot. This question is related to the way in which the input function can be shaped by evolutionary selection [1]. It is believed that gene networks can “learn” new computations on an evolutionary timescale by means of mutations [22–25]. Changes are mainly due to point mutations, gene duplications, and rearrangements [26–28]. The degree to which mutations can change the computation, without ruining the essential function, may be termed “plasticity” [1,29–32]. The larger the plasticity, the more readily a network can learn new computations in a new environment.
To address this, we study the plasticity of the lac input function. We measured the effects of point mutations in the lac promoter region on its input function. We find that the lac input function is quite plastic: even a few point mutations can significantly change the CRIF, leading to input functions that resemble pure AND gates, OR gates, and single-input switches. A mathematical model explains these results and lets us predict which types of gates can and cannot be obtained with point mutations.
Results
Library of Variants of the lac CRR
To study the effects of point mutations in the regulator binding sites of the lac CRR, we constructed a random library of CRR mutants. The library was based on the 113-bp regulatory region of the lac operon from wild-type E. coli. Each CRR variant contained between three and nine point mutations in selected locations in the regulator binding sites (Figure 2).
Diverse Input Functions Are Generated by a Few Point Mutations
To measure the CRIF of each CRR variant, we grew the corresponding reporter strain inside a multiwell fluorimeter in defined glucose medium supplemented with 88 combinations of the two inducers cAMP and IPTG. The CRIF describes the promoter activity in the various inducer combinations. The promoter activity, which corresponds to the rate of GFP production per cell, was measured over the exponential phase of growth. The estimated experimental mean relative error based on day-to-day repeats was about 15%. We find diverse input functions in our library. Two examples, as well as the CRIF of the wild-type promoter region, are shown in Figure 4.
As shown in Figure 4
Mathematical Model
We used a model for the lac CRIF based on the equilibrium binding of RNAp and the two regulators, CRP and LacI, to the lac promoter region. In our previous study, this model was found to describe well the wild-type CRIF [8]. The mathematical model describes the system at the level of effective binding affinities of the regulators. The three parameters in the model that define the interactions of CRP, LacI, and RNAp with their DNA sites are denoted d, c, and a. The mutant variants in this study correspond to input functions in which these affinity parameters are varied with respect to the wild-type input function.
Parameter Space and Phenotype Space
The model allows a convenient description of the range of CRIFs that can be reached by point mutations in the CRR. The three parameters that define the interactions of the regulators and RNAp with their DNA sites, a, c, and d, can be used to define a 3-D parameter space of possible CRR variants (Figure 5).
Each point in this parameter or “genotype” space corresponds to a specific CRIF “phenotype.” To describe the space of phenotypes, note that each CRIF can be described by the ratios of the three plateaus (plateaus I, II, and III defined in
Figure 3 In contrast to forbidden CRIFs, some CRIF forms lie in dense regions of the design space and, thus, can be readily reached by point mutations in the CRR. These functions include AND gates, OR gates, and single-input switches. The CRR variants in the present study are represented in phenotype space in
Figure 5.
Discussion
The effect of point mutations on the input function of a promoter region was studied. We found that a few point mutations can change the input function significantly, resulting in AND-like gates, OR-like gates, and single-input switches. The observed CRIF variants can be explained in a unified way by means of a mathematical model. The model explains which gates cannot be reached by point mutations in the regulator sites. The mathematical model allows depiction of the mapping between parameter space and phenotype space (or CRIF space) (Figure 5).
The wild-type CRIF appears to be easily changed by mutations into new functions. It is plastic in the sense that many mutations do not ruin the input function completely, but rather result in a new computation, a new way to integrate the inputs. A CRIF that can access potentially useful computations with few mutations can readily adapt in case the environmental conditions change [1]. Not all new computations can be learned, however, with point mutations in the cis-regulatory sites of this promoter. The range of functions that can be reached by simple point mutations is strongly constrained by the structural form of the CRR and its input regulators. For example, input functions in which plateau IV is lower than plateau I, II, or III cannot be reached. Similarly, input functions in which plateau I is higher than any other plateau cannot be reached. The set of forbidden functions includes NAND, NOR, XOR, and EQUAL gates, and others (ten of the 16 possible two-input logic functions are forbidden; Figure 1).
The present approach can also be used to construct and characterize new input functions. New input functions are useful for the design of synthetic gene circuits made of well-characterized transcription factors [24,35–50]. Most synthetic circuits built so far have promoters with only a single-input regulator. Addition of multi-input functions [5,51] could significantly strengthen the computational power of synthetic gene circuits and mimic real biological design [52,53]. One limitation is our current lack of understanding of the precise mapping between the DNA sequence of a binding site and the model parameters that describe its effective in vivo affinity. For example, we find that some of the changes in the CRIFs do not correlate in a simple way with the distance of the mutated binding sites from their consensus sequences. That is, in some cases a mutation moved a site closer to its consensus sequence, whereas the model affinity parameter was predicted to be lower for the mutated CRR than for wild type (unpublished data). Indeed, the in vivo affinity and efficacy of a regulator may depend on the sequence context outside of its site. This means that in some cases we may not be able to fully predict in vivo parameters based solely on the DNA sequence of the CRR, requiring an empirical search similar to the present study.
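The count of forbidden gates quoted in the Discussion (ten of the 16 two-input logic functions) can be checked by brute force. The sketch below enumerates every assignment of 0 or 1 to the four plateaus and keeps those that satisfy the two constraints stated in the text: plateau IV is never lower than another plateau, and plateau I is never higher than another plateau. The plateau labelling (I = neither input active, II and III = exactly one input active, IV = both active) and the gate names are assumptions made for illustration, since the figure that defines the plateaus is not reproduced here.

```python
from itertools import product

# Assumed plateau order: (I, II, III, IV), with I = neither input active,
# II and III = exactly one input active, IV = both inputs active.
NAMES = {
    (0, 0, 0, 0): "always OFF",
    (0, 0, 0, 1): "AND",
    (0, 1, 0, 1): "single-input switch (first input)",
    (0, 0, 1, 1): "single-input switch (second input)",
    (0, 1, 1, 1): "OR",
    (1, 1, 1, 1): "always ON",
    (1, 1, 1, 0): "NAND",
    (1, 0, 0, 0): "NOR",
    (0, 1, 1, 0): "XOR",
    (1, 0, 0, 1): "EQUAL",
}


def reachable(p):
    """Apply the two constraints quoted in the text to plateau levels p."""
    p1, p2, p3, p4 = p
    return p4 >= max(p1, p2, p3) and p1 <= min(p2, p3, p4)


# All 16 Boolean truth tables over the four plateaus.
all_gates = list(product((0, 1), repeat=4))
allowed = [g for g in all_gates if reachable(g)]
forbidden = [g for g in all_gates if not reachable(g)]

for g in allowed:
    print(g, NAMES.get(g, "unnamed"))
print(len(forbidden), "of", len(all_gates), "gates are forbidden")
```

Under these assumptions the surviving truth tables are the AND gate, the OR gate, the two single-input switches, and the two constant functions, which agrees with the gate types reported for the library.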
Our main finding is that a few mutations can change an input function significantly, resulting in AND-like gates, OR-like gates, and single-input switches. The present study thus relates to the question of how input functions are shaped by evolutionary selection. This question may also be further studied experimentally [22,23,25,54,55] by evolving bacteria in defined environments that favor different input functions. It would also be interesting to study how input functions vary between species that live in different environments. The present experimental and theoretical approach could be readily extended to study plasticity in other gene systems.
Materials and Methods
Plasmids and strains
Promoter activity was measured using low-copy plasmids [8] that report the transcription rate of a fast-folding GFP reporter from the CRR of interest. The wild-type lac CRR of E. coli K12 strain MG1655 was used as a basis for mutations. Variants of the CRR were generated by custom synthesis (BaseClear, Leiden, The Netherlands) of a 113-bp DNA fragment with the sequence (genomic coordinates 365438–365669): AAWTGTGAGC GCAACGCAAT TAATGTGAGT TAGCTCACWW HTTAGGCACC CCAGGCTWTA CACTTTATGC TTCCGGCTCG WATGTTGTGT GGAATTGTGA GCGGATAACA ATT, where W at positions 3, 39, 40, 57, and 81 is A or T with equal probability and H at position 41 is A, T, or C with equal probability. The CRR library was cloned into pU66 [8] and transformed into MG1655. Colonies were isolated and the CRR was fully sequenced. One of the CRR variants (U339) was synthesized (GenScript, Piscataway, New Jersey, United States) to also have the O3 sequence replaced by the O1 sequence. Reporter strains are listed in Table 1.
Culture and growth conditions
Cultures (1 ml) inoculated from single colonies were grown for 16 h in M9C defined medium (M9, 2 mg/ml glucose, 1 mM MgSO4, 0.1 mM MgCl2, 25 μg/ml kanamycin) at 37 °C with shaking at 250 rpm. To map each CRIF, the cultures were diluted to OD600 = 0.003 into M9C with different concentrations of cAMP (0–20 mM, Sigma, St. Louis, Missouri, United States) and IPTG (0–200 μM), at a final volume of 150 μl per well in a flat-bottomed 96-well plate (Sarstedt, Beaumont Leys, United Kingdom). The cultures were covered with 100 μl of mineral oil (Sigma) to prevent evaporation and grown for about 18 h in a Wallac Victor2 multiwell fluorimeter (PerkinElmer, Wellesley, Massachusetts, United States) at 37 °C, set with an automatically repeating protocol of shaking and OD600 and fluorescence readings [8]. Time between repeated measurements was 6 min. In order to correct for the differences in growth rates (especially due to the different concentrations of cAMP), background fluorescence at a given OD was determined from the fluorescence of cells bearing a promoterless GFP vector at the same OD and at the same cAMP concentrations (a total of 12 different control conditions; IPTG was not found to have a large effect on the cell growth rate, data available on request). Cells growing on glucose, with saturating external cAMP, and cells growing on glycerol (high endogenous cAMP), without exogenous cAMP, show similar lac promoter activity and growth rates. The rate of GFP production, divided by the OD at midexponential growth, provided a measure of the promoter activity: PA = (dGFP/dt)/OD [8]. Note that this measure takes dilution by growth into account: if GFP is produced at rate β per cell per unit time and there are N(t) cells, then dGFP/dt = βN(t), so dividing by the OD, which is proportional to N(t), gives a quantity proportional to β. At all conditions, the promoter activity achieved an approximately constant value during about two cell cycles in midexponential growth. We computed the promoter activity in each of the 88 growth conditions by an average of the promoter activity over these two cell cycles, resulting in the CRIF map. Day-to-day variability in fluorescence and OD data gathered from the instruments was about 10% (both for GFP fluorescence and absorbance) [56]. The relative error in the promoter activity measurement is about 15%. Each of the variants in our study was mapped in the same conditions and in the same strain as all other variants. Hence, the changes in GFP expression result from the mutations in the promoter. The wild-type lac system is intact on the chromosome and identical for all variants.
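The promoter-activity definition above can be sketched numerically. The example below estimates (dGFP/dt)/OD by central finite differences and averages it over a window of about two doublings. The time series is synthetic, and every parameter value (the production rate `beta`, the growth rate, the window indices) is a hypothetical choice for illustration, not data from this study; background subtraction is omitted.

```python
import math


def promoter_activity(times, gfp, od):
    """Return (dGFP/dt)/OD at interior time points using central differences."""
    pa = []
    for i in range(1, len(times) - 1):
        dgfp_dt = (gfp[i + 1] - gfp[i - 1]) / (times[i + 1] - times[i - 1])
        pa.append(dgfp_dt / od[i])
    return pa


# Synthetic, hypothetical data: exponential growth sampled every 6 min.
beta = 50.0               # assumed GFP production rate per OD unit per min
mu = math.log(2) / 60.0   # assumed growth rate (60-min doubling time)
od0 = 0.003               # starting OD600, matching the dilution step
times = [6.0 * k for k in range(60)]
od = [od0 * math.exp(mu * t) for t in times]
# Integrating dGFP/dt = beta * OD(t) from zero initial GFP gives:
gfp = [beta * od0 / mu * (math.exp(mu * t) - 1.0) for t in times]

pa = promoter_activity(times, gfp, od)
# Average over 20 consecutive samples (120 min, about two doublings here).
window = pa[20:40]
mean_pa = sum(window) / len(window)
print(round(mean_pa, 2))
```

Because the synthetic GFP trace was generated with a constant production rate per unit OD, the recovered average should lie close to `beta`, which illustrates why dividing by OD removes the effect of growth.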
In our previous study [8], we reported a comparison of the present GFP-reporter plasmid and a direct chromosomal enzyme production assay using a colorimetric substrate to assay the production of LacZ from its endogenous locus. The lac system has classically been studied by using such colorimetric assays for the lacZ gene product. The present GFP-reporter plasmid measurement is different in several ways from assays of enzyme activity from the chromosomally encoded operon: (i) the low-copy plasmid (pSC101 origin) introduces several extra copies of the promoter region, thus potentially titrating out LacI; (ii) the promoter region on the plasmid lacks the O2 binding site (+411 in the lacZ coding region), a site whose absence makes shutoff of the promoter about 5-fold weaker; and (iii) the plasmid DNA may be harder to loop, reducing repression strength. The previous experiments [8] with the colorimetric assay, using accurate time-resolved ONPG absorbance measurements, indicated that the input functions found by the two methods are qualitatively similar with four plateaus and four threshold levels. However, some of the plateaus were deeper in the ONPG assay. This difference presumably reflects the above-mentioned plasmid effects.
We also measured cell–cell variability in GFP expression using flow cytometry with a narrow gate on side- and forward-scatter (Figure SOM 1A in Protocol S1). We found that in the present conditions the GFP distributions were single peaked, and few all-or-none effects [57,58] were discerned (Figure SOM 1B in Protocol S1).
Mathematical model Promoter activity was modeled based on equilibrium binding of the regulators CRP and LacI as described in [
8]. The promoter activity is:
where CRP activity (fraction of CRP bound to cAMP) is $A = [\text{CRP–cAMP}]/[\text{CRP}_T] = X^n/(1 + X^n)$, in which $X = [\text{cAMP}]/K_{\text{cAMP}}$ is cAMP concentration in units of its dissociation constant for CRP. Cooperativity is described by the Hill coefficient $n$. Similarly, the fraction of LacI not bound to IPTG is $R = [\text{LacI-free}]/[\text{LacI}_T] = 1/(1 + Y^m)$, with $Y = [\text{IPTG}]/K_{\text{IPTG}}$, where $m$ is the Hill coefficient. Three parameters, $a$, $c$, and $d$, define the binding affinities of the regulators and RNAp to their sites: $a = [\text{RNAp}]/K_P$ is RNAp concentration in units of its dissociation constant to a free promoter when cAMP–CRP is not bound to the CRR, $c = [\text{LacI}]/K_R$ is LacI concentration in units of its dissociation constant to its site, and $d = [\text{CRP}]/K_C$ is CRP concentration in units of the dissociation constant for binding to its site. The stabilization of RNAp binding by CRP is given by the ratio of its affinity without and with cAMP–CRP binding: $\eta = K_P/K_{CP}$ (using the notation of [8] Equation 10, $\eta = b/a$; note the typo with $2b$ instead of $b$ in the corresponding equation in [8]). Finally, $\alpha$ and $\gamma$ are the maximal and basal transcription rates. We note that the present experimental data do not appear to be sufficient to obtain unique fits to all of the model parameters, without further measurements (such as direct estimates of the parameter $\eta$).
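The input transforms A and R defined above are straightforward to compute. The sketch below implements them and, because the promoter-activity equation itself is not reproduced in this text, combines them in an assumed equilibrium-binding form: a generic partition function over five promoter states (empty, RNAp bound, CRP bound, CRP and RNAp bound, LacI bound). That combined expression, the Hill coefficients, and all numerical parameter values are illustrative assumptions, not the fitted model of [8].

```python
def crp_activity(x, n):
    """A = X^n / (1 + X^n), with X = [cAMP]/K_cAMP."""
    return x**n / (1.0 + x**n)


def free_laci(y, m):
    """R = 1 / (1 + Y^m), with Y = [IPTG]/K_IPTG."""
    return 1.0 / (1.0 + y**m)


def promoter_activity(x, y, a, c, d, eta, alpha=1.0, gamma=0.0, n=2.0, m=2.0):
    """ASSUMED form (not taken from the paper). Statistical weights are
    1 (empty), a (RNAp), d*A (CRP), eta*a*d*A (CRP + RNAp), c*R (LacI);
    transcription occurs only from the two RNAp-bound states."""
    A = crp_activity(x, n)
    R = free_laci(y, m)
    bound = a + eta * a * d * A
    z = 1.0 + a + d * A + eta * a * d * A + c * R
    return alpha * bound / z + gamma


# Hypothetical parameter values, chosen only to illustrate the four plateaus.
params = dict(a=0.05, c=50.0, d=5.0, eta=20.0)
lo, hi = 1e-3, 1e3  # inducer levels far below / far above their thresholds
plateaus = {
    "no cAMP, no IPTG": promoter_activity(lo, lo, **params),
    "cAMP only": promoter_activity(hi, lo, **params),
    "IPTG only": promoter_activity(lo, hi, **params),
    "cAMP and IPTG": promoter_activity(hi, hi, **params),
}
for name, value in plateaus.items():
    print(f"{name}: {value:.4f}")
```

Varying `a`, `c`, and `d` in such a sketch changes the relative heights of the plateaus, which is the sense in which point mutations in the binding sites are described as moving the CRIF through parameter space.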
Protocol S1: Plasticity of the CRIF of a Gene (171 KB PDF)
Acknowledgments
We thank M. Surette, M. Elowitz, G. Yagil, D. Tawfik, M. Kirschner, A. J. Ninfa, and all members of our lab for discussions.
Funding. We thank the National Institutes of Health, Minerva, Human Frontiers Science Program, and Israel Science Foundation for support. AEM acknowledges a Pacific Theaters Foundation postdoctoral grant.
Competing interests. The authors have declared that no competing interests exist.
Author contributions. AEM, YS, and UA conceived and designed the experiments, performed the experiments, and analyzed the data. SS and AZ contributed reagents/materials/analysis tools. AEM and UA wrote the paper.
Citation: Mayo AE, Setty Y, Shavit S, Zaslaver A, Alon U (2006) Plasticity of the cis-regulatory input function of a gene. PLoS Biol 4(4): e45.
¤ Current address: Department of Biological Chemistry, University of Michigan Medical School, Ann Arbor, Michigan, United States of America
References