Nucleic Acids Res. Aug 15, 2003; 31(16): 4689–4695.
PMCID: PMC169978

Identification and analysis of ‘extended –10’ promoters in Escherichia coli

Abstract

We have compiled and aligned the DNA sequences of 554 promoter regions from Escherichia coli and analysed the alignment for sequence similarities. We have focused on the similarities and differences between promoters that either do or do not contain an extended –10 element. The distribution of –10 and –35 hexamer element sequences, the range of spacer lengths between these elements and the frequencies of occurrence of different nucleotides, dinucleotides and trinucleotides were investigated. Extended –10 promoters, which contain a 5′-TG-3′ element, tend to have longer spacer lengths than promoters that do not. They also tend to show fewer matches to the consensus –35 hexamer element and contain short runs of T residues in the spacer region. We have shown experimentally that the extended –10 5′-TG-3′ motif contributes to promoter activity at seven different promoters. The importance of the motif at different promoters is dependent on the sequence of other promoter elements.

INTRODUCTION

In bacteria, gene expression is dependent on transcription by a single RNA polymerase. For transcription, the bacterial RNA polymerase must be recruited to promoters and transcription initiation orchestrated. A large number of promoter contacts are made by RNA polymerase during initiation, and most sequence-specific contacts are made by the σ subunit (1–3). The C-terminal domain of the RNA polymerase α subunit can also make sequence-specific contacts with UP elements, which are usually located 40–60 bp upstream from the transcription start point (4,5). Transcription initiation of most genes in Escherichia coli is driven by RNA polymerase containing σ70 rather than one of the six alternative σ factors. The most highly conserved promoter elements are the –10 and –35 hexamers, located 10 and 35 bp upstream from the transcription start point. For RNA polymerase containing σ70, consensus sequences for these elements are 5′-TATAAT-3′ and 5′-TTGACA-3′, respectively, and the spacing between these elements is crucial for promoter function. The optimum spacer region length is 17 nucleotides. In general, deviation from either the consensus sequences or the optimum spacer length leads to a reduction in promoter activity (6–9).

Escherichia coli σ70-dependent promoters also contain a less well conserved promoter element, the ‘extended –10’ element. This was originally identified in Bacillus subtilis where, as with other Gram-positive bacteria, it is more conserved than in E.coli (10–14). The extended –10 element is located one base upstream of the –10 hexamer, with the major determinant, 5′-TG-3′, positioned at –15/–14 with respect to the transcription start point. We reported previously that the extended –10 promoter element is present in ~20% of all E.coli promoters (15). In addition to the 5′-TG-3′ determinant, in Gram-positive bacteria there is extended sequence conservation to give the consensus 5′-TRTG-3′, known as the –16 region (16,17).

Previous work from this laboratory has focused on transcription initiation at promoters containing the extended –10 element, using the galP1 promoter and several synthetic derivatives as models (15). In this work, we first analysed the sequences of 554 E.coli promoters and identified those promoters that contain an extended –10 5′-TG-3′ motif. We examined other conserved sequence elements, and investigated whether extended –10 promoters have different patterns of sequence conservation compared with non-extended –10 promoters. We also checked whether 5′-TRTG-3′ sequences were conserved. We then showed experimentally that, for several naturally occurring extended –10 promoters, the 5′-TG-3′ motif is an important determinant for promoter activity.

MATERIALS AND METHODS

Promoter compilation and analysis

The starting point was the compilation of 300 E.coli promoter sequences assembled by Lisser and Margalit (18), with the addition of 54 promoters from original papers cited by Ozoline et al. (19). The 1983 Hawley and McClure (6) and 1987 Harley and Reynolds (7) compilations were also consulted. The final major source of DNA sequences came from the inspection of references cited in the E.coli K-12 Linkage Map (20). We originally identified over 800 promoters, but only 554 of these had a mapped transcription start and an identified –10 region, thereby qualifying them for inclusion in our analysis (the compilations that we used are available at http://www.biosciences.bham.ac.uk/labs/minchin/mitchell2003/). The position and sequences of the –10 and –35 hexamer elements for each promoter were taken from the original literature, which may have assumed that the promoter is recognised by RNA polymerase containing σ70; therefore, some promoters in our list may be recognised by alternative σ factors. Our compilation focuses on the extended –10 promoter element and, therefore, promoters were aligned starting with the last base of the –10 hexamer, which is denoted as position –7. Since this study focuses on the extended –10 region, no attempt was made to align –10 and –35 element sequences by the insertion of gaps within the spacer, and analysis has been restricted to the bases downstream from position –28. The exception is the linked consensus analysis of both the –10 and –35 elements shown in Table 2, which uses only the 12 nucleotides of the two hexamer elements. Analysis of the –35 hexamer is based on 514 promoters (96 TG extended –10 promoters and 418 without the TG motif) for which the –35 element was identified in the original literature.

Table 2.
Conservation of nucleotides within promoters at –10 and –35 elements

Methods of analysis

The sequences and names of the promoters were initially aligned using Microsoft Excel. Using simple and basic functions within this program, the conservation of the –10 hexamer was calculated base by base. The same program was used to divide the promoters into subgroups, with and without the extended –10 5′-TG-3′ element, and then to probe the relationship and conservation of –35 and –10 regions and the distribution of different spacer lengths. For nucleotide and dinucleotide frequencies within the promoters, sequences were investigated from positions –7 to –28. Microsoft Access was used for the trinucleotide analysis.
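The base-by-base calculation described above was carried out in Microsoft Excel. Purely as an illustrative equivalent, the following Python sketch computes the frequency of each base at each position of a set of aligned –10 hexamers. The function name and the four toy hexamers are assumptions made for illustration; they are not taken from the compilation.

```python
from collections import Counter


def position_frequencies(hexamers):
    """For each column of equal-length aligned hexamers, return a dict
    mapping base -> fraction of sequences carrying that base."""
    if not hexamers:
        raise ValueError("at least one hexamer is required")
    length = len(hexamers[0])
    if any(len(h) != length for h in hexamers):
        raise ValueError("all hexamers must have the same length")
    n = len(hexamers)
    return [
        {base: count / n for base, count in Counter(column).items()}
        for column in zip(*hexamers)
    ]


# Toy input (hypothetical sequences, NOT from the 554-promoter set).
toy = ["TATAAT", "TATGAT", "TACAAT", "GATAAT"]
freqs = position_frequencies(toy)

# Columns correspond to positions -12 ... -7 of the -10 hexamer.
for position, column in zip(range(-12, -6), freqs):
    print(position, column)
```

Applied to the TG and non-TG subgroups separately, the same function would give the kind of per-position comparison summarised in Table 1.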

Strains, promoters and plasmids

The E.coli K12 strain used throughout this work was DH5α (21). Promoter fragments were amplified from genomic DNA using PCR. Promoter fragments were between 100 and 175 bp in length and included ~30 bp of DNA downstream of the transcription start point. The sequences, between positions –50 and +10, of all the promoters used are listed in Table 5. PCR fragments were cloned into the plasmid pGEM-T Easy. For promoter activity assays, the promoter fragments were subcloned using EcoRI and HindIII restriction sites into the low copy number, broad host range lac expression vector, pRW50, to give a promoter::lac operon fusion (22).

Table 5.
Promoter region sequences and activity relative to the aroF promoter

Mutagenesis and activity measurements for mutant promoters

Promoter mutants were constructed using the QuikChange™ site-directed mutagenesis kit (Stratagene). The 5′-TG-3′ motif was changed to TT, CG and CT in the envA, purEK, purMN and purFP1 promoters. In addition, all possible combinations of bases at positions –15 and –14 of the aroF, ompF and gyrA promoters were constructed. The activities of these different promoters were compared with the activities of derivatives of the synthetic KAB promoter carrying all possible base combinations at positions –15 and –14, previously constructed by Burr et al. (15). The bases at positions –17/–16 in the aroF promoter were also replaced by all possible combinations, and activities were compared with the corresponding set of KAB derivatives (also previously constructed by Burr et al.). All mutant promoters were checked by automated sequencing, provided by the University of Birmingham Genomics Laboratory. To measure promoter activities, DNA fragments carrying different promoters were subcloned into pRW50 and the activity of promoter::lac operon fusions was determined using β-galactosidase assays, as described by Miller (23). To do this, pRW50 plasmid derivatives were transformed into DH5α cells, and pre-cultures and assay cultures were grown in Lennox Broth (LB) plus tetracycline (35 µg/ml). Each assay was performed independently three times.

RESULTS AND DISCUSSION

Analysis of 554 promoter regions: –10 elements

Previous authors have searched for sequence patterns in different collections of bacterial promoters, e.g. O’Neill (24) and Galas et al. (25). In this study, we have focused on the analysis of promoters that contain the extended –10 motif, i.e. the sequence 5′-TG-3′ at positions –15/–14, located 1 bp upstream from the –10 hexamer (promoters containing this element are referred to here as TG promoters). We identified 106 such promoters. We have compared the sequence conservation of these 106 TG promoters with the sequence conservation of the 448 non-TG promoters (i.e. those lacking an extended –10 motif).
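The classification into TG and non-TG promoters depends only on the alignment convention used here, in which the last base of the –10 hexamer is position –7. A minimal sketch of that test is given below, assuming each promoter is supplied as a string that ends at position –7; the function names and the two example sequences are hypothetical.

```python
def base_at(seq, position):
    """Return the base at a promoter position, for a sequence whose
    LAST base is position -7 (the last base of the -10 hexamer)."""
    index = len(seq) + 6 + position  # position -7 maps to the last index
    if position > -7 or index < 0:
        raise ValueError(f"position {position} is outside the sequence")
    return seq[index]


def is_tg_promoter(seq):
    """True if the sequence carries 5'-TG-3' at positions -15/-14."""
    return base_at(seq, -15) + base_at(seq, -14) == "TG"


# Hypothetical sequences spanning positions -18 ... -7.
with_tg = "AATTGCTATAAT"     # TG at -15/-14, one base (C) before the hexamer
without_tg = "AATCACTATAAT"  # CA at -15/-14

print(is_tg_promoter(with_tg))     # True
print(is_tg_promoter(without_tg))  # False
```

Because the index is computed from the 3′ end, sequences of different upstream lengths can be mixed without padding.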

First, we considered promoter –10 hexamer sequences (Table 1). Overall, the conservation of the –10 hexamer according to our alignment is in good agreement with previously published compilations. The base T at position –7 is most highly conserved (90%), followed closely by the base A at position –11 (87%) and the base T at position –12 (79%). Bases at the other three positions are less well conserved. The TG promoters show less conservation within the –10 element compared with their non-TG counterparts. At positions –12 and –10, T occurs 7 and 8.5% less frequently in TG promoters compared with the non-TG counterparts. Thus, the 5′-TG-3′ motif may be compensating for a poor –10 hexamer, or TG promoters may have specific sequence requirements within their –10 element.

Table 1.
Conservation of the –10 hexamer in E.coli promoters

Relationship between –10 and –35 elements

Analysis of the promoter set reveals that 37% of promoters have a –10 hexamer that has a 4/6 match to the consensus and the majority (95%) of promoters have a match of at least 3/6 (Table 2). Examination of the –35 hexamer in the set shows that most promoters have either a 4/6 or 3/6 match to the consensus (both occur at a frequency of 29%). The –35 hexamer is not as well conserved as the –10 hexamer, with 17% of promoters having a –35 hexamer with a match to the consensus of less than 3/6. Due to the flexibility in the length of the spacer between the –35 and –10 hexamers, and the fact that we are looking for conservation within a window of six residues, a match of at least two nucleotides would be expected by chance. Many promoters that have a poor match to the –35 hexamer appear to compensate by having a better match to the –10 hexamer consensus. A perfect match to the consensus within both the –10 and the –35 hexamer was not found. Thus, a typical E.coli promoter contains a 4/6 match to the consensus within both the –10 and –35 hexamers.
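The match scores quoted above (3/6, 4/6 and so on) are counts of positions at which a hexamer agrees with the consensus. As a sketch only, the following Python code scores hexamers against the two σ70 consensus sequences given in the Introduction and tabulates the distribution of scores. The function names and the toy hexamers are assumptions for illustration and do not reproduce the data in Table 2.

```python
from collections import Counter

MINUS_10_CONSENSUS = "TATAAT"  # from the Introduction
MINUS_35_CONSENSUS = "TTGACA"  # from the Introduction


def consensus_matches(hexamer, consensus):
    """Number of positions at which hexamer and consensus agree."""
    if len(hexamer) != len(consensus):
        raise ValueError("hexamer and consensus must be the same length")
    return sum(a == b for a, b in zip(hexamer.upper(), consensus))


def match_distribution(hexamers, consensus):
    """Fraction of hexamers at each match score (e.g. 4 means 4/6)."""
    counts = Counter(consensus_matches(h, consensus) for h in hexamers)
    n = len(hexamers)
    return {score: counts[score] / n for score in sorted(counts)}


# Toy -10 hexamers (hypothetical, NOT from the compilation).
toy = ["TATAAT", "TATGAT", "TAAGCT", "GATACT"]
print(match_distribution(toy, MINUS_10_CONSENSUS))
# {3: 0.25, 4: 0.25, 5: 0.25, 6: 0.25}
```

Scoring the –10 and –35 hexamers of the same promoter and cross-tabulating the two scores would give a linked analysis of the kind shown in Table 2.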

Sequence conservation within the –10 and –35 hexamers of the TG promoters differs from the non-TG promoters. TG promoters appear to be more tolerant of sequence variation within the –10 and –35 hexamers. For example, 26% of TG promoters have a –35 hexamer with a match of less than 3/6 to the consensus compared with 15% for non-TG promoters. Similarly, 38% of TG promoters have a –10 hexamer with three or fewer bases corresponding to the consensus (24% for non-TG promoters).

Data presented in Table 3 show that the most frequent spacer distance between promoter –10 and –35 elements is 17 bp, with 44% of all promoters having this spacer length. These data are in agreement with those obtained by Lisser and Margalit (18). Comparison of TG with non-TG promoters shows that a higher proportion of TG promoters have a longer spacer region. Thirty-nine percent of TG promoters have a spacer length of ≥18 bp (26% for non-TG promoters). This is in agreement with the study by O’Neill (24), which analysed sequence conservation as a function of spacer length. Thus, many extended –10 promoters have atypical spacing between their –10 and –35 elements, as well as greater degeneracy in their sequences.

Table 3.
The distribution of spacer length at E.coli promoters

Dinucleotide and trinucleotide analysis of E.coli promoters

Sequence conservation of selected dinucleotides from positions –17 to –7 in our compilation of promoters is shown in Table 4. The dinucleotide 5′-TG-3′ at positions –15/–14 is present in 19% of the promoters and occurs three times more often than predicted by chance (6.25%). It has been suggested that the E.coli extended –10 promoter element is analogous to the –16 promoter region found in Gram-positive bacteria, which has the consensus 5′-TRTG-3′ (16,17). Analysis of the TG promoters shows that 18% have the sequence 5′-TRTG-3′ (10% 5′-TATG-3′; 8% 5′-TGTG-3′), which is marginally more than for non-TG promoters (15%).
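The chance expectation of 6.25% corresponds to one specific dinucleotide out of 16, which assumes that the four bases are equally frequent and independent at each position. Under that assumption, the enrichment can be checked from the counts reported above (106 TG promoters out of 554). The function name below is hypothetical.

```python
def dinucleotide_enrichment(observed_count, total, alphabet_size=4):
    """Return (observed fraction, fraction expected by chance, ratio)
    for one specific dinucleotide at a fixed pair of positions.

    Assumes equal, independent base frequencies, so the chance
    expectation is 1 / alphabet_size**2 (1/16 for DNA).
    """
    if total <= 0:
        raise ValueError("total must be positive")
    observed = observed_count / total
    expected = 1 / alphabet_size ** 2
    return observed, expected, observed / expected


observed, expected, ratio = dinucleotide_enrichment(106, 554)
print(f"observed {observed:.1%}, expected {expected:.2%}, ratio {ratio:.2f}")
```

A composition-corrected expectation, using the actual base frequencies of the promoter set, would differ from 6.25%; the uniform model is used here only because it matches the figure quoted in the text.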

Table 4.
Dinucleotide occurrence around the extended –10 element of E.coli promoters

Previous analyses have revealed the presence of short poly(A) or poly(T) tracts in the spacer region of E.coli promoters (26,27). To investigate whether these tracts are preferentially distributed at TG or non-TG promoters, we calculated the percentage of promoters carrying AAA or TTT tracts centred at positions from –8 to –27 at the two sets of promoters. The data presented in Figure 1 show that there is a clear preference, in the TG promoter set, for TTT tracts centred at positions –18, –20 and –25. We did not observe conservation of any other dinucleotide or trinucleotide sequences within positions –8 to –27.

Figure 1
Trinucleotide occurrence in E.coli promoter regions. (a) The frequency of occurrence of the trinucleotide AAA on the non-template strand, centred at different locations from positions –7 to –28 in TG promoters (triangles) or non-TG promoters ...
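The trinucleotide analysis was performed in Microsoft Access. As an illustrative alternative, the sketch below computes the fraction of promoters carrying a given trinucleotide centred at a given position, assuming each sequence is a string whose last base is position –7. The function name and the toy sequences are hypothetical.

```python
def tract_fraction(sequences, tract, centre):
    """Fraction of sequences with `tract` (3 bases) centred at `centre`.

    Each sequence must end at position -7, so a tract centred at
    position p occupies positions p-1, p and p+1.
    """
    if not sequences:
        raise ValueError("at least one sequence is required")
    if len(tract) != 3:
        raise ValueError("tract must be a trinucleotide")
    hits = 0
    for seq in sequences:
        i = len(seq) + 6 + centre  # index of the centre base
        if i - 1 < 0 or i + 1 > len(seq) - 1:
            raise ValueError(f"centre {centre} does not fit in {seq!r}")
        if seq[i - 1:i + 2] == tract:
            hits += 1
    return hits / len(sequences)


# Toy sequences spanning positions -15 ... -7 (hypothetical).
toy = ["GCTTTGCAT", "GCATTGCAT"]
print(tract_fraction(toy, "TTT", -12))  # 0.5
print(tract_fraction(toy, "TTT", -13))  # 0.0
```

For sequences spanning positions –28 to –7, the valid centres run from –27 to –8, which is the range examined in the text.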

The extended –10 motif is required for transcription from several E.coli promoters

To date, the 5′-TG-3′ motif has been studied in E.coli at a small number of promoters, namely galP1, cysG and phage λ PRE (28–31), and is essential for the maximal activity of these promoters. Our sequence analysis shows that the extended –10 motif is found at many other E.coli promoters. Thus, for further study, we selected 11 different promoters, all with a 5′-TG-3′ motif, but with different –10, –35 and UP elements. These promoters showed a range of activities when assayed (Table 5). The contribution of the extended –10 motif to the activity of the seven most active promoters was investigated experimentally by changing the 5′-TG-3′ motif to 5′-TT-3′, 5′-CG-3′ or 5′-CT-3′. Table 6 lists the activities of the 21 mutant promoters. The data show that the 5′-TG-3′ motif is essential for optimal activity of all seven promoters. The purEK, purMN, purFP1, aroF and ompF promoters are highly dependent on the motif, whereas envA and gyrA are less dependent. It appears that promoters with poorer matches to the –10 and –35 consensus hexamers are more dependent on the 5′-TG-3′ motif.

Table 6.
Activity of mutants of different TG promoters

The aroF, ompF and gyrA promoters have been studied in more detail. We constructed sets of 15 derivatives of each promoter, which contained all possible combinations of bases at positions –15 and –14. Figure 2 illustrates the activity of each set of mutant promoters. The activities are compared with the activities of a similar set of mutant derivatives of the semi-synthetic KAB promoter studied by Burr et al. (15). The data show that the different bases at positions –15 and –14 result in similar patterns of activity. This argues strongly that the 5′-TG-3′ motif plays a similar role at different promoters.

Figure 2
Activities of promoters with mutations at positions –15 and –14. Each panel illustrates the activity of families of different promoters with different base combinations at positions –15 and –14, as indicated on the abscissae. ...
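The figure of 15 derivatives per promoter follows from the 16 possible dinucleotides minus the wild-type sequence. A short sketch of that enumeration is given below; the function name is hypothetical and the code illustrates the counting only, not the mutagenesis procedure.

```python
from itertools import product


def dinucleotide_variants(wild_type):
    """All dinucleotides over A, C, G, T except the wild-type one."""
    wt = wild_type.upper()
    if len(wt) != 2 or set(wt) - set("ACGT"):
        raise ValueError("wild_type must be two bases from A, C, G, T")
    all_pairs = ("".join(pair) for pair in product("ACGT", repeat=2))
    return [pair for pair in all_pairs if pair != wt]


variants = dinucleotide_variants("TG")
print(len(variants))  # 15
print(variants)
```

The three substitutions tested at all seven promoters (TT, CG and CT) are a subset of this list.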

The role of the dinucleotide at positions –17 and –16

In the final part of this study, we investigated the importance of the weakly conserved base sequence at positions –17 and –16. To do this, we used the aroF promoter, which was the strongest of the 11 selected TG promoters (Table 5). It contains –10, –35 and UP elements that correspond well to the consensus, two 5′-TTT-3′ tracts in the spacer region, and an extended –10 element, 5′-TGTG-3′. We constructed a set of 15 derivatives of the aroF promoter, which contained all possible combinations of bases at positions –17 and –16. Figure 3 illustrates the activity of each set of mutant promoters. The activities are compared with the activities of a similar set of mutant derivatives of the semi-synthetic KAB promoter studied by Burr et al. (15). The data show that the different bases at positions –17 and –16 result in similar patterns of activity, suggesting that these bases play a similar role at different promoters.

Figure 3
Activities of promoters with mutations at positions –17 and –16. The panels illustrate the activity of families of different promoters with different base combinations at positions –17 and –16, as indicated on the abscissae. ...

There is no direct evidence for contact between the bases at positions –17 and –16 of a promoter and σ. It is likely that a 5′-TR-3′ dinucleotide at this position affects the DNA structure and/or flexibility such that it aids the interaction of σ with the downstream promoter DNA. In Gram-positive bacteria, many promoters contain both a conserved –35 hexamer and an extended –10 motif; in such cases, the 5′-TG-3′ motif contributes to optimal activity (32,33). This also appears to be true in E.coli. For example, the aroF promoter contains both a 5′-TG-3′ motif and a –35 hexamer close to consensus, but is still dependent upon the extended –10 motif for optimal activity. Thus, the extended –10 motif may be important at all promoters where it is present, irrespective of the presence of other core promoter elements, but the level of dependence must be a function of these sequences. We conclude that the extended region immediately upstream of the –10 hexamer at many E.coli promoters works together with other promoter elements to drive optimal promoter activity.

ACKNOWLEDGEMENTS

This work was funded by a project grant from The Wellcome Trust and by a Darwin Trust Studentship to D.Z.

REFERENCES

1. McClure,W.R. (1985) Mechanism and control of transcription in prokaryotes. Annu. Rev. Biochem., 54, 171–204.
2. Gross,C.A., Chan,C., Dombroski,A., Gruber,T., Sharp,M., Tupy,J. and Young,B. (1998) The functional and regulatory roles of sigma factors in transcription. Cold Spring Harbor Symp. Quant. Biol., 63, 141–155.
3. Helmann,J.D. and deHaseth,P.L. (1999) Protein–nucleic acid interaction during open complex formation investigated by systematic alteration of protein and DNA binding partners. Biochemistry, 38, 5959–5967.
4. Ross,W., Gosink,K.K., Salomon,J., Igarashi,K., Zou,C., Ishihama,A., Severinov,K. and Gourse,R.L. (1993) A third recognition element in bacterial promoters: DNA binding by alpha subunit of RNA polymerase. Science, 262, 1407–1413.
5. Estrem,S.T., Gaal,T., Ross,W. and Gourse,R.L. (1998) Identification of an UP element consensus sequence for bacterial promoters. Proc. Natl Acad. Sci. USA, 95, 9761–9766.
6. Hawley,D. and McClure,W.R. (1983) Compilation and analysis of Escherichia coli promoter DNA sequence. Nucleic Acids Res., 11, 2237–2256.
7. Harley,C.B. and Reynolds,R.P. (1987) Analysis of Escherichia coli promoter sequences. Nucleic Acids Res., 15, 2343–2361.
8. Gross,C.A., Chan,C.L. and Lonetto,M.A. (1992) Structure function analysis of Escherichia coli RNA polymerase. Philos. Trans. R. Soc. Lond. Ser. B Biol. Sci., 351, 475–482.
9. deHaseth,P.L. and Helmann,J.D. (1995) Open complex formation by Escherichia coli RNA polymerase: the mechanism of polymerase-induced strand separation of double helical DNA. Mol. Microbiol., 16, 817–824.
10. Moran,C.P., Lang,N., LeGrice,S.F.J., Lee,G., Stephens,M., Sonenshein,A.L., Pero,J. and Losick,R. (1982) Nucleotide sequences that signal the initiation of transcription and translation in Bacillus subtilis. Mol. Gen. Genet., 186, 339–346.
11. Helmann,J.D. (1995) Compilation and analysis of Bacillus subtilis σA-dependent promoter sequences: evidence for extended contact between RNA polymerase and upstream promoter DNA. Nucleic Acids Res., 23, 2351–2360.
12. Sabelnikov,A.G., Greenberg,B. and Lacks,S.A. (1995) An extended –10 promoter alone directs transcription of the DpnII operon of Streptococcus pneumoniae. J. Mol. Biol., 250, 144–155.
13. Bashyam,M.D. and Tyagi,A.K. (1998) Identification and analysis of ‘extended –10’ promoters from mycobacteria. J. Bacteriol., 180, 2568–2573.
14. McCracken,A. and Timms,P. (1999) Efficiency of transcription from promoter sequence variants in Lactobacillus is both strain and context dependent. J. Bacteriol., 181, 6569–6572.
15. Burr,T., Mitchell,J., Kolb,A., Minchin,S. and Busby,S. (2000) DNA sequence elements located immediately upstream of the –10 hexamer in Escherichia coli promoters: a systematic study. Nucleic Acids Res., 28, 1864–1870.
16. Voskuil,M.I., Voepel,K. and Chambliss,G.H. (1995) The –16 region, a vital sequence for the utilization of a promoter in Bacillus subtilis and Escherichia coli. Mol. Microbiol., 17, 271–279.
17. Voskuil,M.I. and Chambliss,G.H. (2002) The TRTGn motif stabilizes the transcription initiation open complex. J. Mol. Biol., 322, 521–532.
18. Lisser,S. and Margalit,H. (1993) Compilation of E.coli mRNA promoter sequences. Nucleic Acids Res., 21, 1507–1516.
19. Ozoline,O.N., Deev,A.A., Arkhipova,M.V., Chasov,V.V. and Travers,A. (1999) Proximal transcribed regions of bacterial promoters have a non-random distribution of A/T tracts. Nucleic Acids Res., 27, 4768–4774.
20. Berlyn,M.K. (1998) Linkage map of Escherichia coli K-12, edition 10: the traditional map. Microbiol. Mol. Biol. Rev., 62, 814–984.
21. Hanahan,D. (1983) Studies on transformation of Escherichia coli with plasmids. J. Mol. Biol., 166, 557–580.
22. Lodge,J., Fear,J., Busby,S., Gunasekaran,P. and Kamini,N.R. (1992) Broad host range plasmids carrying the Escherichia coli lactose and galactose operons. FEMS Microbiol. Lett., 74, 271–276.
23. Miller,J.H. (1972) Experiments in Molecular Genetics. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY.
24. O’Neill,M.C. (1989) Escherichia coli promoters consensus as it relates to spacing class, specificity, repeat substructure and three-dimensional organisation. J. Biol. Chem., 264, 5522–5530.
25. Galas,D.J., Eggert,M. and Waterman,M.S. (1985) Rigorous pattern-recognition methods for DNA sequences: analysis of promoter sequences from Escherichia coli. J. Mol. Biol., 186, 117–128.
26. Collis,C.M., Molloy,P.L., Both,G.W. and Drew,H.R. (1989) Influence of the sequence-dependent flexure of DNA on transcription in E.coli. Nucleic Acids Res., 17, 9447–9468.
27. Lozinski,T., Markiewicz,W.T., Wyrzykiewicz,T.K. and Wyrzykiewicz,K.L. (1989) Effect of the sequence-dependent structure of the 17 bp AT spacer on the strength of consensus-like Escherichia coli promoters in vivo. Nucleic Acids Res., 17, 3855–3863.
28. Keilty,S. and Rosenberg,M. (1987) Constitutive function of a positively regulated promoter reveals new sequences essential for activity. J. Biol. Chem., 262, 6389–6395.
29. Chan,B. and Busby,S. (1989) Recognition of nucleotide sequences at the Escherichia coli galactose operon P1 promoter by RNA polymerase. Gene, 84, 227–236.
30. Kumar,A., Malloch,R.A., Fujita,N., Smillie,D.A., Ishihama,A. and Hayward,R.S. (1993) The minus 35 recognition region of Escherichia coli sigma 70 is inessential for initiation of transcription at an ‘extended minus 10’ promoter. J. Mol. Biol., 232, 406–418.
31. Belyaeva,T., Griffiths,L., Minchin,S., Cole,J. and Busby,S. (1993) The Escherichia coli cysG promoter belongs to the ‘extended –10’ class of bacterial promoters. Biochem. J., 296, 851–857.
32. Voskuil,M.I. and Chambliss,G.H. (1998) The –16 region of Bacillus subtilis and other Gram-positive bacterial promoters. Nucleic Acids Res., 26, 3584–3590.
33. Henkin,T.M. and Sonenshein,A.L. (1987) Mutations of the Escherichia coli lacUV5 promoter resulting in increased expression in Bacillus subtilis. Mol. Gen. Genet., 209, 467–474.

Articles from Nucleic Acids Research are provided here courtesy of Oxford University Press