J Bacteriol. Sep 2005; 187(17): 6166–6174.
PMCID: PMC1196165

Immobilization of Escherichia coli RNA Polymerase and Location of Binding Sites by Use of Chromatin Immunoprecipitation and Microarrays

Abstract

The genome-wide location of RNA polymerase binding sites was determined in Escherichia coli using chromatin immunoprecipitation and microarrays (chIP-chip). Cross-linked chromatin was isolated in triplicate from rifampin-treated cells, and DNA bound to RNA polymerase was precipitated with an antibody specific for the β′ subunit. The DNA was amplified and hybridized to “tiled” oligonucleotide microarrays representing the whole genome at 25-bp resolution. A total of 1,139 binding sites were detected and evaluated by comparison to gene expression data from identical conditions and to 961 promoters previously identified by established methods. Of the detected binding sites, 418 were located within 1,000 bp of a known promoter, leaving 721 previously unknown RNA polymerase binding sites. Within 200 bp, we were able to detect 51% (189/368) of the known σ70-specific promoters occurring upstream of an expressed open reading frame and 74% (273/368) within 1,000 bp. Conversely, many known promoters were not detected by chIP-chip, leading to an estimated 26% false-negative rate. Most of the detected binding sites could be associated with expressed transcription units, but 299 binding sites occurred near inactive transcription units. This map of RNA polymerase binding sites represents a foundation for studies of transcription factors in E. coli and an important evaluation of the chIP-chip technique.

RNA polymerase (RNAP) in Escherichia coli is a key factor in gene expression and catalyzes the transcription of DNA to mRNA for all genes (reviewed in references 33 and 37). The core enzyme is composed of five subunits: two α, one β, one β′, and one ω. Core RNAP becomes transcriptionally active holoenzyme with the addition of a σ factor (Fig. 1A). E. coli possesses seven interchangeable σ factors, each with specificity for different promoters. Sigma factors function as global regulators of gene expression and mediate the transcriptional response to conditions under which a large number of genes need to be turned on or off, such as stationary phase or heat shock. With the exception of σ54, sigma factors bind DNA only as part of the holoenzyme complex (9).

FIG. 1.
RNAP transcription cycle and the effect of rifampin. (A) Mechanisms of transcription initiation and rifampin binding. RNAP (blue) binds DNA to form a closed complex with affinities that differ among promoters by at least a factor of 100. The closed complex ...

To map promoters in bacteria, we sought a way to force RNAP to reside only at promoters so that identifying DNA fragments bound to RNAP in vivo would report promoter locations. A variety of small-molecule inhibitors of RNA polymerase were evaluated for the immobilization of RNAP, and rifampin was found to work best (M. Raffaelle, E. Kanin, J. Vogt, R. R. Burgess, A. Z. Ansari, unpublished data). The antibiotic rifampin inhibits bacterial growth by binding the β subunit of RNAP just upstream of the active site, blocking the synthesis of RNAs longer than 2 to 3 nucleotides (nt) (11) (Fig. 1A). Rifampin has no effect on RNAP promoter binding to form closed complexes or on open complex formation (30) and has no effect on RNAP in vitro when added after elongating RNAP has cleared the promoter (12, 38). When added to a growing culture, rifampin diffuses across bacterial membranes and binds tightly to RNAP not engaged in transcription, stopping growth without killing the cells. RNAP holoenzyme complexed with rifampin can still bind to promoters but is trapped in an abortive cycle and unable to extend RNAs beyond 2 to 3 nt (Fig. 1A). However, RNAP molecules in elongation complexes with RNA and DNA are resistant to rifampin binding. Once elongating RNAPs terminate transcription and release RNA and DNA, they become susceptible to rifampin binding, which traps them in newly formed open complexes.

A promoter is a DNA sequence to which RNAP binds and initiates the transcription of RNA. Knowledge of promoter locations is the first step in the elucidation of the transcriptional regulatory network. It allows the identification of which genes are cotranscribed and, from the DNA sequence, identification of regulatory motifs associated with different regulators. It also provides a basis for the interpretation of the binding site locations of various transcription factors. To date, 961 promoters have been identified individually by primer extension and DNA footprinting studies (www.cifn.unam.mx/Computational_Genomics/regulondb/) (36). In addition, DNA binding motifs have been used to predict the locations of 4,641 promoters throughout the genome (21, 40). New experimental evidence of promoter locations may validate or refute these predictions and will assist in understanding the function of uncharacterized genes and operons.

A powerful method combining chromatin immunoprecipitation and microarray analysis called chIP-chip (or genome-wide location analysis) has been developed for the global identification of transcription factor binding sites (10, 23, 34). In E. coli it has been used to identify the locations of MelR binding sites (16), and in yeast it has been used to identify the targets of over 200 transcription factors (18). Furthermore, chIP-chip was used to identify the targets of yeast RNA polymerase III (Pol III) and to investigate the roles of the accessory factors TFIIIB and TFIIIC (19, 31, 35).

These previous studies used microarrays spotted with PCR products of intergenic DNA and sometimes coding sequence as well. The resolution of the method depends on the size of the spotted PCR products in addition to the size of DNA fragments used in chromatin immunoprecipitation. Oligonucleotide microarrays have also been used (13) and hold the promise of greater resolution and the statistical confidence endowed by multiple signals from each region of the genome. However, the distribution of oligonucleotide probes is highly skewed in E. coli Affymetrix arrays and many regions of the genome are devoid of probes. Therefore, in this study we used a custom-designed oligonucleotide array from Nimblegen containing an evenly spaced “tiled” set of probes. Custom Nimblegen arrays have been used previously for chIP-chip of human transcription factors (25), but to our knowledge, this is the first publication describing the use of whole-genome, uniformly tiled arrays.

The accuracy and resolution of chIP-chip have been evaluated in a number of previous studies. For yeast RNA Pol III, essentially all known targets were correctly identified (19, 31, 35). A comparison of known transcription factor interactions to a large set of chIP-chip data revealed a false-negative rate of 80% (41). In the present study, we used rifampin to trap RNAP at promoters and compared promoter locations determined by chIP-chip to those determined by established methods. The existence of hundreds of known promoter locations in E. coli provides a novel opportunity for the evaluation of this technique.

MATERIALS AND METHODS

Chromatin immunoprecipitation.

In preparatory studies performed at least three times, E. coli K-12 MG1655 cells were grown in 100 ml LB medium in a shaking flask at 37°C. Chromatin immunoprecipitation was performed similarly to the method described in detail below. Endpoint quantitative PCR was performed for 30 cycles using primers designed to amplify from all seven rrn operons. For 16S the primers were GGAGGAATACCGGTGGCGAAGG and GCGTTAGCTCCGGAAGCCACGCCTC. For 23S they were CCAGGATGTTGGCTTAGAAGCAGCC and ACAGAACGCTCCCCTACCCAAC. For the promoter, the reverse-direction primer was GTCTGATAAATTGTTAAAGAGCAGT, while the forward-direction primers were a mixture of CTCCCTATAATGCGCCACCACTG, CTCCCTATAATGCGCCTCCATC, and ATCCCTATAATGCGCCTCCGTT.

For chIP-chip, three biological replicates were performed (three individual cultures). For each culture, two cell pellets of 50 ml culture were processed—one experimental sample as described below and a second control sample identical to the first but with no antibody added to the immunoprecipitation reaction. The chIP DNA of the first biological replicate was analyzed by real-time PCR using primers in the promoter regions of lpdA and fepB, which were measured to be present at levels 35- and 7-fold higher than background, respectively. The experimental and control PCR-amplified samples were labeled with Cy3 and Cy5 by Nimblegen, swapping the cyanine dyes in one biological replicate. The microarrays consisted of 371,034 oligonucleotides from the E. coli genome (release m56 accession no. NC_000913) plus controls. The oligonucleotides were 50 nt long and spaced every 25 nt on both top and bottom strands.
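As a rough consistency check on the array design described above, the number of tiled probes can be estimated from the probe length, the spacing, and the genome length. The sketch below is illustrative only: the function name and the genome length of 4,639,221 bp are assumptions that do not come from the text, and the genome is treated as linear.

```python
def tiled_probe_count(genome_len, probe_len=50, step=25, strands=2):
    """Count probes of probe_len nt that start every `step` nt on a linear sequence."""
    per_strand = (genome_len - probe_len) // step + 1
    return per_strand * strands

# Assumed length of the MG1655 sequence release of that period (not stated in the text).
ASSUMED_GENOME_LEN = 4_639_221

estimate = tiled_probe_count(ASSUMED_GENOME_LEN)
print(estimate)
```

Under these assumptions the estimate comes out at roughly 371,000 probes, close to the 371,034 genomic oligonucleotides reported for the array.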

The method used for chromatin immunoprecipitation was adapted from Lin and Grossman (28). E. coli MG1655, obtained from the E. coli Genetic Stock Center (CGSC no. 7740), was inoculated from frozen stocks into M9 minimal medium-0.2% glucose and grown overnight at 37°C. It was then diluted 1:50 into 200 ml M9-0.2% glucose in a 500 ml flask, and the mouth of the flask was sealed with foil and stirred for 5 to 6 h to an optical density (OD) at 600 nm of ~0.4. Rifampin dissolved in methanol was added to a final concentration of 150 μg/ml and stirred for 20 min. Cultures were monitored by OD to verify the inhibitory effects of rifampin. A slight increase in OD was observed but with a greatly reduced growth rate. Samples (50 ml) were transferred to tubes containing formaldehyde and sodium phosphate (pH 7.6) such that the final concentrations were 1% and 10 mM, respectively. They were incubated at room temperature for 10 min with gentle agitation, and then glycine was added to 100 mM. They were gently agitated at 4°C for 30 min. Cells were centrifuged at 3,500 × g and washed twice in chilled phosphate-buffered saline. The cells were transferred to 1.5 ml tubes, centrifuged, and stored at −80°C.

Cells were resuspended in 0.5 ml of a solution containing 10 mM Tris (pH 8), 50 mM NaCl, 10 mM EDTA, and 20% sucrose, and then 0.2 μl Ready-lyse lysozyme (Epicenter) was added. They were incubated for 30 min at 37°C, and then 0.5 ml of a solution containing 200 mM Tris (pH 8), 600 mM NaCl, 4% Triton X-100, 1 mM phenylmethylsulfonyl fluoride, and 2 μg/ml RNase A was added. They were incubated at 37°C for 10 min and then chilled on ice. The lysates were sonicated four times for 20 s each time using a Sonics and Materials VC50 with a 3 mm microtip. Cell debris was removed by centrifugation, and a sample was removed and extracted with phenol and phenol:chloroform for DNA size analysis. DNAs ranged from 100 to 1,200 bp, with the greatest intensity at ~600 bp. Pan Mouse immunoglobulin G magnetic beads (Dynal catalog no. 110.22) (40 μl) were resuspended and washed per the manufacturer's instructions and then added to the lysates and rotated at 4°C for 3 h. The beads were removed using a magnet and discarded. A total of 2 μl of antibody for RNA polymerase β′ (Neoclone catalog no. W0001) was added, and the samples were rotated overnight at 4°C. Magnetic beads (50 μl) were resuspended and washed as before and then added to the samples and rotated 1 h at 4°C. The beads were recovered and then washed once in 250 mM LiCl-100 mM Tris (pH 8)-2% Triton X-100, twice in 100 mM Tris (pH 8)-600 mM NaCl-2% Triton X-100, twice in 100 mM Tris (pH 8)-300 mM NaCl-2% Triton X-100, and twice in TE (10 mM Tris-HCl, 1 mM EDTA). DNA and protein were eluted from the beads by resuspending in 50 mM Tris (pH 8)-10 mM EDTA-1% sodium dodecyl sulfate and then incubated at 65°C for 20 min. Beads were removed, cross-links were reversed by incubating at 65°C for 4 h, and then DNA was purified using Qiaquick (QIAGEN).

ChIP DNA was amplified based on the Random DNA Amplification method of June 2001 from http://www.microarrays.org/protocols.html. The following primers were designed specifically for E. coli: PF43 (TGGAAATCCGAGTGAGTNNNNNNNNN) and PF44 (TGGAAATCCGAGTGAGT). Round B amplification using exTaq (Takara) was performed for 35 to 40 cycles. Round C amplification was not performed. PCR products were purified with Qiaquick and then ethanol precipitated. Pellets were resuspended in water, and a sample was diluted in 1 mM Tris (pH 8) to measure the DNA concentration. Experimental samples (4 μg) and control samples (3 μg) were hybridized to microarrays by Nimblegen.

Data analysis.

For each array, the Cy5 signal values were multiplied by a scaling factor in Microsoft Access so that the total signals for Cy3 and Cy5 were equal. The log2 of the ratio of experimental signal to control signal was then calculated. The microarray was designed from both the top- and bottom-strand sequences, which means that for every probe, its reverse complement was also represented on the array. To distinguish them in the resulting data, 12 nt were added to the nominal genome position of probes from the bottom strand. Genome-scale data was visualized using Nimblegen's SignalMap software, and most other analysis was performed with Matlab version 6.5 (see the program code in the supplemental material). For each probe, the mean log2 ratio from all three biological replicates was calculated and then smoothed using a moving average with a window size of 500 bp. Peaks were identified using a reiterative process in which the probe with the highest log2 ratio was located upon each iteration. To prevent the same peak from being reidentified each time, the probe with the highest log2 ratio and all data within 1,000 bp were set to zero. This process was reiterated 1,000 times, and the resulting peaks were manually edited using SignalMap for visualization. An additional 139 peaks with a log2 ratio greater than 0.1 were then identified by the same method and manually edited.
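The normalization, smoothing, and iterative peak search described above can be sketched as follows. This is a minimal Python illustration, not the Matlab code from the supplemental material; the function names, the centered window, the stopping rule, and the synthetic data are all assumptions made for the example.

```python
import numpy as np

def scale_to_match(reference, channel):
    """Rescale `channel` so that its total signal equals that of `reference`."""
    return channel * (reference.sum() / channel.sum())

def moving_average(positions, values, window_bp=500):
    """Mean of all probes within window_bp/2 of each probe (positions must be sorted)."""
    half = window_bp / 2
    lo = np.searchsorted(positions, positions - half, side="left")
    hi = np.searchsorted(positions, positions + half, side="right")
    csum = np.concatenate(([0.0], np.cumsum(values)))
    return (csum[hi] - csum[lo]) / (hi - lo)

def find_peaks(positions, smoothed, n_iter, mask_bp=1000):
    """Repeatedly take the highest probe, then zero everything within mask_bp of it."""
    work = np.array(smoothed, dtype=float)
    peaks = []
    for _ in range(n_iter):
        i = int(np.argmax(work))
        if work[i] <= 0:  # assumption: stop once no positive signal remains
            break
        peaks.append((int(positions[i]), float(work[i])))
        work[np.abs(positions - positions[i]) <= mask_bp] = 0.0
    return peaks

# Synthetic log2 ratios: probes every 25 bp with two bumps at 2,000 and 7,000 bp.
positions = np.arange(0, 10_000, 25)
log2_ratio = (3.0 * np.exp(-((positions - 2000) ** 2) / (2 * 300.0 ** 2))
              + 2.0 * np.exp(-((positions - 7000) ** 2) / (2 * 300.0 ** 2)))

smoothed = moving_average(positions, log2_ratio)
peaks = find_peaks(positions, smoothed, n_iter=2)
print(peaks)
```

Zeroing the neighborhood of each reported probe is what keeps the same peak from being found again on the next iteration; the manual editing step described in the text is not reproduced here.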

RESULTS AND DISCUSSION

Effects of rifampin and experimental design.

We first determined the time required for rifampin to localize RNAP to promoters when added to mid-log E. coli cultures. The outer membrane of gram-negative bacteria like E. coli poses a significant barrier to rifampin, requiring much higher concentrations for inhibition of RNAP in vivo than in vitro (4). When 150 μg/ml rifampin was added, growth ceased almost immediately (Fig. 1B). However, chIP signals in the rRNA operon persisted much longer, indicating a lack of coupling between rRNA transcription and cell growth. We found that 20 min of incubation with rifampin was required for chIP signals in the rRNA operon to reach near-background levels. Based on this result, we chose 20 min as a suitable time for rifampin treatment. This amount of time should allow detection of even the weakest promoters; even if a promoter is occupied by RNAP only once every 20 min, a chIP-chip signal should be detectable, though slightly diminished.

Binding sites for RNA polymerase in the genome of E. coli strain MG1655 were identified using chIP-chip. Briefly, cells were grown to mid-log phase in minimal glucose medium and then treated for 20 min with rifampin. DNA and protein were cross-linked with formaldehyde, and cells were lysed. DNA was sheared to ~600 bp with sonication, and RNAP was immunoprecipitated using an antibody specific for the β′ subunit. DNA bound by RNAP and from a no-antibody mock immunoprecipitation control was amplified using random primers and then labeled with Cy3 and Cy5 fluorescent dyes and hybridized to a custom microarray consisting of 371,034 E. coli oligonucleotides spaced 25 bp apart across the whole genome.

Reproducibility.

This experiment was repeated three times. The reproducibility of the log2 ratios from each array is shown in Fig. 2. Higher correlation was seen when comparing just the experimental channels (coefficients of 0.71, 0.63, and 0.84) or just the controls (0.66, 0.61, and 0.87). This level of correlation is lower than typically observed in microarray analysis of gene expression, but important differences should be considered. In this experiment, the whole genome was represented with oligonucleotide probes, and yet only a fraction of the genome should be present in the DNA that was hybridized to it. Thus, the majority of oligonucleotide probes on the array are not hybridized, leading to an increased amount of noise. The log2 ratios of chIP-chip experiments tend to have less dynamic range and greater noise than those from expression analysis (R. Green, Nimblegen, personal communication).
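Pairwise correlation coefficients of this kind can be computed as in the short sketch below. It assumes Pearson correlation, which the text does not specify, and uses a hypothetical helper name and toy vectors in place of the probe-level data.

```python
import numpy as np
from itertools import combinations

def pairwise_correlations(replicates):
    """Pearson r for every pair of replicate vectors (same probes, same order)."""
    return {
        (i, j): float(np.corrcoef(replicates[i], replicates[j])[0, 1])
        for i, j in combinations(range(len(replicates)), 2)
    }

# Toy data standing in for three replicate signal vectors.
toy = [
    np.array([1.0, 2.0, 3.0, 4.0]),
    np.array([2.0, 4.0, 6.0, 8.0]),
    np.array([4.0, 3.0, 2.0, 1.0]),
]
r = pairwise_correlations(toy)
print(r)
```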

FIG. 2.
Reproducibility of RNA polymerase chIP-chip data. Log2 ratios of immunoprecipitated versus no-antibody control DNAs from three hybridizations are plotted against each other. A cluster of outliers in biological replicate 1 consists almost entirely of probes ...

Unlike a typical gene expression experiment, the absolute log2 ratio is not as important as the shape and location of peaks within the log2 ratio data, as visible in Fig. 3. As expected from the high density of evenly spaced probes, the log2 ratio showed a clear pattern of peaks and valleys. When peaks were identified from each biological replicate individually, the mean standard deviation of peak location was 206 bp. The maximum log2 ratio in each biological replicate ranged from 4.2 to 5.9, and for all further analysis, replicates were combined by calculating the mean. To control for differences in hybridization efficiency of the oligonucleotides, the log2 ratios were smoothed with a moving average. In Fig. 3B it can be seen that smoothing preserved the shape of peaks yet eliminated much of the noise.

FIG. 3.
A sample of RNA polymerase chIP-chip data. In both panels, genome position is indicated at the top and ORFs are shown at the bottom, with those above the line being in the forward orientation and those on bottom being the reverse. (A) rRNA-encoding genes ...

A large cluster of outliers is visible in the top two panels of Fig. 2. These consist almost entirely of probes from biological replicate 1 located in genes encoding rRNA and ribosomal proteins, which are transcribed from the most active promoters in the genome. Almost 96% of probes with differences in log2 ratios > 2 between biological replicates 1 and 3 were located in rRNA genes or ribosomal protein operons. The variation at the rrnC operon in the three biological replicates can be seen in Fig. 3A. It appears that RNA polymerase was distributed throughout the operon in biological replicate 1 but was confined to two promoter regions in biological replicates 2 and 3. The rifampin stock used in biological replicate 1 was filter sterilized, which probably removed some rifampin and lowered the actual concentration. For biological replicates 2 and 3 the rifampin was not filtered. It seems likely that in biological replicate 1 the rifampin concentration was lower, and RNAP was able to escape from these very strong promoters. It should be noted that two peaks were observed at all of the seven rRNA gene operons. The first peak corresponds to the P1 and P2 promoters, indistinguishable at this scale. The other peak is probably a conserved promoter of unknown significance noted previously (2).

Binding site locations.

From the combined smooth data, 1,139 peaks were identified, representing binding sites for RNAP. They may also represent promoters, if RNA is transcribed from them. Binding sites less than a few hundred base pairs apart or on opposite strands are expected to appear as a single peak. Binding and initiation by RNAP is strongly dependent on other transcription factors and the conditions tested. Presumably, the sites we detected are a subset of the total binding sites in E. coli. For instance, peaks were not observed at the sites of either the araBAD or lacZYA promoters, probably because those binding sites are occluded by repressors in glucose medium.

There are ~2,800 RNAP molecules in E. coli in minimal medium (8) and 2,428 predicted transcription units (TUs) in the genome, not including ~40 TUs containing recently annotated small regulatory RNAs (TU predictions are from www.biostat.wisc.edu/gene-regulation/) (7). Of these TUs, 1,552 contain an open reading frame (ORF) that appears to be expressed under these conditions (log2 signal > 8.0 in gene expression data generated with Affymetrix arrays) (data from reference 14). Of these, 1,141 have 5′ ends that are more than 500 bp apart, providing an estimate of the number of binding sites that we might expect in this experiment. This number agrees remarkably well with our results. Of the 1,139 sites we detected, 958 occurred within 1,000 bp of the start site of a TU. For 659 of these, the TU contained an ORF with a log2 signal > 8.0 in gene expression data from the same growth conditions. Interestingly, this indicates 299 binding sites that are not associated with active transcription. In these cases, RNAP may bind to the chromosome but not activate transcription until a repressor or activator is triggered in response to environmental conditions, as is generally the case for transcription from σ54-specific promoters. It should be noted, though, that after treatment with rifampin, all active promoters may become saturated by RNAP, forcing binding to sites that are not normally bound. Additionally, some promoters may bind RNAP weakly and yet still be active by having high rates of open complex formation or promoter clearance.
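The estimate of how many distinct binding sites to expect rests on counting TU 5′ ends that are far enough apart to be resolved. One way such a count could be made is sketched below; the greedy grouping rule and the function name are assumptions, since the text does not say exactly how the 1,141 figure was derived.

```python
def count_resolvable_starts(starts, min_gap_bp=500):
    """Count 5' ends that lie more than min_gap_bp beyond the previously kept one."""
    kept = []
    for s in sorted(starts):
        if not kept or s - kept[-1] > min_gap_bp:
            kept.append(s)
    return len(kept)

# Toy coordinates: 300 merges with 100, and 2400 merges with 2000.
toy_starts = [100, 300, 900, 2000, 2400, 5000]
n_resolvable = count_resolvable_starts(toy_starts)

# Arithmetic from the text: sites near a TU start minus those near an expressed TU.
sites_near_inactive_tu = 958 - 659
print(n_resolvable, sites_near_inactive_tu)
```

The subtraction on the last lines reproduces the 299 binding sites that the text reports as not associated with active transcription.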

Of the 1,139 binding sites, 501 occurred in intergenic regions and 638 occurred in ORFs. Those in ORFs can be subdivided into three groups on the basis of the distance from the site to the nearest 5′ end of an ORF (median distance = 158 bp). The first group consists of binding sites that are probably located upstream of the ORF but mistakenly identified inside the ORF due to poor resolution of the chIP-chip technique. In this group are 372 sites for which the nearest 5′ end of an ORF is that of the ORF in which they occur and is <500 bp away. The second group consists of sites that are probably promoters for a downstream ORF. In this group are 191 sites for which the nearest 5′ end of an ORF is an ORF that is different from the one in which it occurs and is <500 bp away. The third group consists of sites that occur >500 bp away from the 5′ end of any ORF (see Table S3 in the supplemental material). Of these 75 sites, many have small log2 ratios and occur in insertion elements or in genes of unknown function. The others occur either in the middle of an ORF (sites in sapA, rfbX, evgS, emrA, nlpD, pyrG, xylG, and fimD) or near the intergenic region between two convergent genes (sites in pinQ, intQ, and otsA). These sites may play a regulatory role, possibly as RNAP pause sites or promoters of antisense transcripts, or they may be cryptic promoter-like elements ordinarily masked by transcribing RNAP.
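The three-way grouping of sites that fall inside ORFs can be expressed as a small classification rule. The sketch below is illustrative only: the Orf record, the group labels, and the handling of a site exactly 500 bp from a 5′ end (placed in the third group) are assumptions not taken from the text.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Orf:
    name: str
    left: int    # lower genome coordinate
    right: int   # higher genome coordinate
    strand: str  # "+" or "-"

    @property
    def five_prime(self):
        return self.left if self.strand == "+" else self.right

def classify_site(site, orfs, cutoff_bp=500):
    """Assign a binding site to one of the groups described in the text."""
    containing = [o for o in orfs if o.left <= site <= o.right]
    if not containing:
        return "intergenic"
    nearest = min(orfs, key=lambda o: abs(o.five_prime - site))
    if abs(nearest.five_prime - site) >= cutoff_bp:
        return "group3"  # far from the 5' end of any ORF
    if nearest in containing:
        return "group1"  # probably upstream of the ORF it appears to lie in
    return "group2"      # probably a promoter for a neighboring ORF

# Toy annotation with two adjacent forward-strand ORFs.
orfs = [Orf("geneA", 1000, 3000, "+"), Orf("geneB", 3100, 4000, "+")]
labels = [classify_site(s, orfs) for s in (500, 1100, 2900, 2000)]
print(labels)

# The group sizes reported in the text add up to the 638 sites found in ORFs.
assert 372 + 191 + 75 == 638
```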

Correlation to other factors.

There was very little correlation between the chIP-chip log2 ratio and the similarity of known promoters to a σ70 consensus binding matrix (Fig. 4A). This result is surprising given that (i) the relative efficiency of chIP, often measured as “occupancy,” has been observed to correlate with in vitro binding affinity (39) and (ii) RNAP has high affinity for promoters containing sequences close to consensus (26). A possible explanation for this lack of correlation may be the excess of RNAP molecules over the number of potential binding sites in the presence of rifampin. In excess and “trapped” at promoters, RNAP may bind to all sites equally. We also found very little correlation between the log2 ratio of chIP-chip peaks and the expression of the nearest downstream ORF (Fig. 4B), which is not surprising considering that the log2 ratio correlates poorly with the consensus matrix. This observation differs from studies of yeast Pol III in which the log2 ratio was directly correlated with transcription (35). The discrepancy may be due to biological differences between eukaryotic and prokaryotic RNAP, such as the ratio of enzyme molecules to potential binding sites or to disparate mechanisms of gene regulation. More likely, it is due to experimental differences; in the study performed by Roberts et al. (35) rifampin was not used to trap RNAP at the promoters and occupancy was measured in a transcribed region.

FIG. 4.
Poor correlation between the magnitude of the RNAP chIP-chip log2 ratio, gene expression, and promoter similarity to the consensus matrix. (A) The RNAP binding site log2 ratio was plotted against the similarity of the promoter to consensus. Of 681 known ...

We examined the distribution and periodicity of binding sites in the E. coli genome. The densities of binding sites differed throughout the genome, ranging from 13 to 36 sites per 100 kb. There was no correlation between the number of binding sites per 100 kb and the average predicted TU size in each 100-kb region (correlation coefficient = −0.11; TU predictions from Bockhorst et al.) (7). We did not observe any significant periodicity in the location of binding sites, but we did detect periodicity when both the location and log2 ratio of the peaks were considered (Fig. 5). A wavelet transform (6, 29) revealed that the average log2 ratio varied in a region from ~1,900 to 3,400 kb with a periodicity of ~660 kb, which is significant relative to the periodicity found in randomized data. The regions of high and low average log2 ratios correlated with regions of high and low average gene expression levels (Fig. 5C). Periodicity of this sort has been observed for many different genomic parameters across the evolutionary spectrum, though the meaning, whether related to the physical organization of the genome or some other factor, is unknown (1, 24) (T. E. Allen, N. D. Price, A. R. Joyce, B. Ø. Palsson, unpublished data). It is unusual that only a fraction of the genome shows periodicity for RNAP binding sites. It is also interesting that a correlation between log2 ratio and gene expression was only observed when the log2 ratios were smoothed over 330 kb. On a gene-by-gene basis there may have been too much noise to see this trend.

FIG. 5.
Density and periodicity of RNA polymerase binding sites in glucose minimal medium as detected by chIP-chip. (A) The log2 ratios of 1,139 binding sites were averaged in each 1-kb section of the genome. Those sections lacking a binding site were assigned ...

Comparison to locations of known promoters.

To validate the RNAP binding sites, we examined the ability of chIP-chip to detect 961 known promoters compiled in RegulonDB (36). For each known promoter, the nearest peak was located in the RNAP chIP-chip data (Table 1). A total of 418 RNAP binding sites were located within 1,000 bp of a known promoter, leaving 721 binding sites identified by chIP-chip that have not been previously characterized. Of the known promoters detected, most were specific to σ70, but high percentages of σ24-, σ28-, and σ32-specific promoters were also detected. These sigma factors are associated with the transcription of genes for flagella biosynthesis and stress response. Gene expression measurements indicate that the genes downstream from σ28-specific promoters are probably not expressed under these conditions (average log2 signal = 7.13), while those downstream of σ24- and σ32-specific promoters probably are (average log2 signals of 9.62 and 10.71, respectively). The expression of stress response genes does not necessarily indicate the presence of stress; heat-shock proteins are normally expressed at 37°C (20), and σ32 is required for growth above 20°C (42). The log2 ratio of chIP-chip peaks near σ24- and σ32-specific promoters showed poor correlation to gene expression (r = 0.17). Few σ38- and σ54-specific promoters were detected by chIP-chip. These sigma factors are associated with the transcriptional response to stationary phase and nitrogen assimilation, respectively.

TABLE 1.
Proximity of known promoters to RNAP binding sites detected in this study

These differences in the proportions of known promoters detected do not appear to match the relative abundances of the sigma factors. During logarithmic growth in LB medium, protein measurements showed that σ70 is predominant, followed in abundance by σ28 and σ54, while the others were undetectable (22). Under the conditions used here for chIP-chip, the mRNAs for all of the sigma factors appeared to be expressed, though σ19 and σ28 were less abundant (Table 1). In the presence of rifampin, the alternate sigma factors may have an increased ability to form complexes with core RNAP due to an increased level of free core RNAP and the depletion of σ70 through binding with RNAP at promoters. The σ38- and σ54-specific promoters may be less often detected because of the activity of other transcription factors precluding the binding of RNAP to σ38- and σ54-specific promoters under these conditions. In any case, the detection of binding sites for the alternative sigma factors indicates that either those sigma factors are present at some level under these conditions or there is degeneracy in the binding specificity of σ70-holoenzyme in vivo.

The distribution of distances between known σ70-specific promoters and chIP-chip RNAP binding sites is shown in Fig. 6. We found that 49% of known σ70-promoters correlated to a chIP-chip peak within 200 bp, and 73% correlated within 1,000 bp. Of 681 known σ70-promoters, 532 occur upstream of an apparently expressed ORF (the nearest downstream ORF had a log2 signal > 8.0). Of these, 368 were separated by at least 500 bp, and out of 368 sites, 273 could be linked to a chIP-chip peak within 1,000 bp. This leaves 95 known expressed promoters that we were unable to detect, resulting in a 26% (95/368) false-negative rate. This compared favorably to chIP-chip experiments in yeast, for which a false-negative rate of 80% has been estimated (41). It falls short, though, of the nearly perfect detection of RNA Pol III binding sites (19, 31, 35). The false-positive rate is impossible to estimate, because binding does not always result in transcription and because of the lack of any other viable methods for measuring protein binding in vivo.
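The comparison of known promoters to chIP-chip peaks reduces to a nearest-neighbor distance on genome coordinates followed by a threshold. The sketch below shows one way to compute it; the function names and toy coordinates are hypothetical, and the circularity of the chromosome is ignored for simplicity.

```python
import numpy as np

def nearest_peak_distance(promoters, peaks):
    """Distance in bp from each promoter to the closest peak (linear coordinates)."""
    peaks = np.sort(np.asarray(peaks))
    promoters = np.asarray(promoters)
    idx = np.searchsorted(peaks, promoters)
    left = peaks[np.clip(idx - 1, 0, len(peaks) - 1)]
    right = peaks[np.clip(idx, 0, len(peaks) - 1)]
    return np.minimum(np.abs(promoters - left), np.abs(promoters - right))

def fraction_within(distances, cutoff_bp):
    """Fraction of promoters with a peak no more than cutoff_bp away."""
    return float(np.mean(distances <= cutoff_bp))

# Toy coordinates, not data from the study.
dist = nearest_peak_distance([100, 5000, 9000], [150, 4000, 9900])
print(dist, fraction_within(dist, 200), fraction_within(dist, 1000))

# Counts reported in the text: 273 of 368 promoters detected, 95 missed.
false_negative_pct = round(100 * 95 / 368)
print(false_negative_pct)
```

The last two lines reproduce the 26% false-negative rate from the counts given in the text.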

FIG. 6.
Histogram of the distance from 681 known σ70 promoters to the nearest RNA polymerase binding site (peak) observed in chIP-chip data.

False negatives.

A variety of factors could explain the false negatives. One explanation may be failure to amplify some parts of the chromosome by random PCR or low efficiency of formaldehyde cross-linking at some promoters (17). As chIP-chip data becomes increasingly common, characteristics of poorly cross-linked proteins or DNA sequences may emerge. It is also possible that the epitope for the antibody we used was occluded by other transcription factors at some promoters. Antibodies specific to other RNAP subunits are available, and it may be found that their use detects an overlapping set of RNAP binding sites. Results obtained by performing a single replicate of chIP-chip with antibodies for σ70 and β′ with Affymetrix arrays showed that the σ70 sites were a subset of the sites detected using antibody for β′, as expected (see Table S4 and Fig. S2 in the supplemental material). Of 13,499 probes with σ70 log2 ratios > 1, 97% had similar β′ log2 ratios (difference < 1). On the other hand, of 15,032 probes with β′ log2 ratios > 1, only 74% had similar σ70 log2 ratios. The log2 ratios from the β′ and σ70 Affymetrix chIP-chip experiments correlated with a coefficient of 0.72. The log2 ratios from the Affymetrix chips were compared to those from the Nimblegen chips. For each Affymetrix probe, the nearest Nimblegen probe in the same genomic orientation was identified, correcting for the different versions of the E. coli genome sequence that were used in chip design. The correlation coefficients were 0.60 for comparisons of Affymetrix β′ to Nimblegen β′ and 0.64 for comparisons of Affymetrix σ70 to Nimblegen β′. Affymetrix arrays were not used in any further experiments because of the incomplete coverage of the E. coli genome provided by Affymetrix expression arrays (see Fig. S2 in the supplemental material).

Another likely explanation for failure to detect RNAP at known σ70 or other sigma factor promoters is the high variability of the rates of open complex formation and promoter escape and of the affinity of promoters for initial binding of RNAP holoenzyme (33). These same parameters may explain why it takes so long to clear RNAP from the rRNA operons. At some E. coli promoters, open complexes are so unstable that they initiate immediately upon binding nucleoside triphosphate without the slow promoter escape and high probability of abortive initiation observed at many promoters. The rrn P1 promoter is a classic example of such a promoter. It exhibits no abortive initiation and has the added features of highly efficient initial RNAP binding and the ability to sequester up to 80% of transcribing RNAP into elongation complexes on the seven rrn operons in E. coli (15, 32). RNAP released at a terminator may rebind an rRNA promoter and re-form an rrn elongation complex more rapidly than rifampin can bind to it (σ70 can associate with RNAP during termination) (3). Thus, RNAP may become trapped at rRNA promoters only after multiple rounds of transcription and when the rifampin concentration inside the cell reaches a higher level, accounting for the slow clearance of RNAP from the rrn operon (Fig. 3A). Other promoters, which form unstable open complexes but lack the avid initial RNAP binding and rapid promoter escape properties of rrn promoters, may never accumulate significant RNAP, because RNAP will equilibrate away to promoters able to form stable open complexes. Thus, certain promoters that form unstable open complexes may escape detection. Consideration of the kinetic properties of promoters will be an important issue in the development of promoter mapping by the rifampin-trapping chIP-chip method.

It should also be noted that many of the sites we identified are likely to be specific to the minimal glucose conditions used here. Expansion of this experiment to other culture conditions will likely reduce the number of undetected known promoters and reveal differences in RNAP binding resulting from the activity of the regulatory network. Gene expression data also show changing patterns under different conditions but are often difficult to interpret. By using chIP-chip, direct regulatory interactions can be distinguished from indirect regulatory interactions (27, 34), but care should be taken given the false-negative rate of 26% that we observed.

Closing remarks.

Transcriptional regulation can be seen as a combinatorially based molecular “computation.” In this analogy, the binding and interaction of various transcription factors with RNAP are the input variables and transcription level is the output. The identification of transcription factors involved in each “computation” is a fundamental step in understanding observations of gene expression. As each new factor is identified by chIP-chip or some other technique, its role will be defined by its interaction and colocation with RNAP. The work presented here lays the foundation for future work in elucidating the regulatory network of E. coli. The location data can also be used for validation of computational predictions of promoters, and the log2 values may provide insight into some characteristic of RNAP binding or activity. Whether the observed false-negative detection of known promoters reveals an inherent limitation of the technique or is merely a peculiarity of the experimental conditions is yet to be seen. A comparison of experiments with or without rifampin added, under different growth conditions, and with antibodies for various sigma factors will shed light on this question.

ADDENDUM IN PROOF

After this article was accepted, a genome-wide map of human RNA polymerase II preinitiation complexes made using chIP-chip and Nimblegen microarrays was published (T. H. Kim, L. O. Barrera, M. Zheng, C. Qu, M. A. Singer, T. A. Richmond, Y. Wu, R. D. Green, and B. Ren. Nature doi:10.1038/nature03877 [29 June 2005]). A large number of false negatives were encountered in that study as well, which the authors attribute to lack of sensitivity in microarray detection of immunoprecipitated DNA.

Acknowledgments

The analysis of known promoters would not have been possible without the data provided by Araceli Huerta-Moreno, Heladia Salgado-Osorio, and Julio Collado-Vides. We thank B. Ren, B. K. Cho, and A. Joyce as well as H. Holster, K. Nuwaysir, and M. Hogan at Nimblegen for their help.

This work was initiated at the University of Wisconsin Madison with the support of University of Wisconsin I&EDR funds, Hatch McIntyre Stennis (U.S. Department of Agriculture), and University of Wisconsin trust funds (to A.Z.A.) and a grant from the National Institutes of Health (GM38660 to R.L.) and completed at the University of California San Diego with support from the National Institutes of Health (GM62791 to B.Ø.P.). C.D.H. was supported in part by NLM 5T15LM007359.

B.Ø.P. serves on the Scientific Advisory Board of Genomatica Inc., a spin-off company from the University of California San Diego.

Footnotes

Supplemental material for this article may be found at http://jb.asm.org/.

REFERENCES

1. Allen, T. E., M. J. Herrgard, M. Liu, Y. Qiu, J. D. Glasner, F. R. Blattner, and B. O. Palsson. 2003. Genome-scale analysis of the uses of the Escherichia coli genome: model-driven analysis of heterogeneous data sets. J. Bacteriol. 185:6392-6399. [PMC free article] [PubMed]
2. Amemiya, K., V. Bellofatto, L. Shapiro, and J. Feingold. 1986. Transcription initiation in vitro and in vivo at a highly conserved promoter within a 16 S ribosomal RNA gene. J. Mol. Biol. 187:1-14. [PubMed]
3. Arndt, K. M., and M. J. Chamberlin. 1988. Transcription termination in Escherichia coli. Measurement of the rate of enzyme release from Rho-independent terminators. J. Mol. Biol. 202:271-285. [PubMed]
4. Atherly, A. G. 1974. Ribonucleic acid regulation in permeabilized cells of Escherichia coli capable of ribonucleic acid and protein synthesis. J. Bacteriol. 118:1186-1189. [PMC free article] [PubMed]
5. Benjamini, Y., and Y. Hochberg. 1995. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Stat. Soc. Ser. B 57:289-300.
6. Bentley, P. M., and J. T. E. McDonnell. 1994. Wavelet transforms: an introduction. Electron. Commun. Eng. J. 6:175-186.
7. Bockhorst, J., Y. Qiu, J. Glasner, M. Liu, F. Blattner, and M. Craven. 2003. Predicting bacterial transcription units using sequence and expression data. Bioinformatics 19(Suppl. 1):i34-i43. [PubMed]
8. Bremer, H., and P. P. Dennis. 1996. Modulation of chemical composition and other parameters of the cell by growth rate, 1553-1569. In F. C. Neidhardt, R. Curtiss III, J. L. Ingraham, E. C. C. Lin, K. B. Low, B. Magasanik, W. S. Reznikoff, M. Riley, M. Schaechter, H. E. Umbarger (ed.), Escherichia coli and Salmonella: cellular and molecular biology, 2nd ed. ASM Press, Washington, D.C.
9. Buck, M., and W. Cannon. 1992. Specific binding of the transcription factor sigma-54 to promoter DNA. Nature 358:422-424. [PubMed]
10. Buck, M. J., and J. D. Lieb. 2004. ChIP-chip: considerations for the design, analysis, and application of genome-wide chromatin immunoprecipitation experiments. Genomics 83:349-360. [PubMed]
11. Campbell, E. A., N. Korzheva, A. Mustaev, K. Murakami, S. Nair, A. Goldfarb, and S. A. Darst. 2001. Structural mechanism for rifampicin inhibition of bacterial RNA polymerase. Cell 104:901-912. [PubMed]
12. Carpousis, A. J., and J. D. Gralla. 1985. Interaction of RNA polymerase with lacUV5 promoter DNA during mRNA initiation and elongation. Footprinting, methylation, and rifampicin-sensitivity changes accompanying transcription initiation. J. Mol. Biol. 183:165-177. [PubMed]
13. Cawley, S., S. Bekiranov, H. H. Ng, P. Kapranov, E. A. Sekinger, D. Kampa, A. Piccolboni, V. Sementchenko, J. Cheng, A. J. Williams, R. Wheeler, B. Wong, J. Drenkow, M. Yamanaka, S. Patel, S. Brubaker, H. Tammana, G. Helt, K. Struhl, and T. R. Gingeras. 2004. Unbiased mapping of transcription factor binding sites along human chromosomes 21 and 22 points to widespread regulation of noncoding RNAs. Cell 116:499-509. [PubMed]
14. Covert, M. W., E. M. Knight, J. L. Reed, M. J. Herrgard, and B. O. Palsson. 2004. Integrating high-throughput and computational data elucidates bacterial networks. Nature 429:92-96. [PubMed]
15. Gourse, R. L., T. Gaal, S. E. Aiyar, M. M. Barker, S. T. Estrem, C. A. Hirvonen, and W. Ross. 1998. Strength and regulation without transcription factors: lessons from bacterial rRNA promoters. Cold Spring Harbor Symp. Quant. Biol. 63:131-139. [PubMed]
16. Grainger, D. C., T. W. Overton, N. Reppas, J. T. Wade, E. Tamai, J. L. Hobman, C. Constantinidou, K. Struhl, G. Church, and S. J. Busby. 2004. Genomic studies with Escherichia coli MelR protein: applications of chromatin immunoprecipitation and microarrays. J. Bacteriol. 186:6938-6943. [PMC free article] [PubMed]
17. Hanlon, S. E., and J. D. Lieb. 2004. Progress and challenges in profiling the dynamics of chromatin and transcription factor binding with DNA microarrays. Curr. Opin. Genet. Dev. 14:697-705. [PubMed]
18. Harbison, C. T., D. B. Gordon, T. I. Lee, N. J. Rinaldi, K. D. Macisaac, T. W. Danford, N. M. Hannett, J. B. Tagne, D. B. Reynolds, J. Yoo, E. G. Jennings, J. Zeitlinger, D. K. Pokholok, M. Kellis, P. A. Rolfe, K. T. Takusagawa, E. S. Lander, D. K. Gifford, E. Fraenkel, and R. A. Young. 2004. Transcriptional regulatory code of a eukaryotic genome. Nature 431:99-104. [PMC free article] [PubMed]
19. Harismendy, O., C. G. Gendrel, P. Soularue, X. Gidrol, A. Sentenac, M. Werner, and O. Lefebvre. 2003. Genome-wide location of yeast RNA polymerase III transcription machinery. EMBO J. 22:4738-4747. [PMC free article] [PubMed]
20. Herendeen, S. L., R. A. VanBogelen, and F. C. Neidhardt. 1979. Levels of major proteins of Escherichia coli during growth at different temperatures. J. Bacteriol. 139:185-194. [PMC free article] [PubMed]
21. Huerta, A. M., and J. Collado-Vides. 2003. Sigma70 promoters in Escherichia coli: specific transcription in dense regions of overlapping promoter-like signals. J. Mol. Biol. 333:261-278. [PubMed]
22. Ishihama, A. 2000. Functional modulation of Escherichia coli RNA polymerase. Annu. Rev. Microbiol. 54:499-518. [PubMed]
23. Iyer, V. R., C. E. Horak, C. S. Scafe, D. Botstein, M. Snyder, and P. O. Brown. 2001. Genomic binding sites of the yeast cell-cycle transcription factors SBF and MBF. Nature 409:533-538. [PubMed]
24. Jeong, K. S., J. Ahn, and A. B. Khodursky. 2004. Spatial patterns of transcriptional activity in the chromosome of Escherichia coli. Genome Biol. 5:R86. [PMC free article] [PubMed]
25. Kirmizis, A., S. M. Bartley, A. Kuzmichev, R. Margueron, D. Reinberg, R. Green, and P. J. Farnham. 2004. Silencing of human polycomb target genes is associated with methylation of histone H3 Lys 27. Genes Dev. 18:1592-1605. [PMC free article] [PubMed]
26. Kobayashi, M., K. Nagata, and A. Ishihama. 1990. Promoter selectivity of Escherichia coli RNA polymerase: effect of base substitutions in the promoter −35 region on promoter strength. Nucleic Acids Res. 18:7367-7372. [PMC free article] [PubMed]
27. Laub, M. T., S. L. Chen, L. Shapiro, and H. H. McAdams. 2002. Genes directly controlled by CtrA, a master regulator of the Caulobacter cell cycle. Proc. Natl. Acad. Sci. USA 99:4632-4637. [PMC free article] [PubMed]
28. Lin, D. C., and A. D. Grossman. 1998. Identification and characterization of a bacterial chromosome partitioning site. Cell 92:675-685. [PubMed]
29. Lio, P. 2003. Wavelets in bioinformatics and computational biology: state of art and perspectives. Bioinformatics 19:2-9. [PubMed]
30. McClure, W. R., and C. L. Cech. 1978. On the mechanism of rifampicin inhibition of RNA synthesis. J. Biol. Chem. 253:8949-8956. [PubMed]
31. Moqtaderi, Z., and K. Struhl. 2004. Genome-wide occupancy profile of the RNA polymerase III machinery in Saccharomyces cerevisiae reveals loci with incomplete transcription complexes. Mol. Cell. Biol. 24:4118-4127. [PMC free article] [PubMed]
32. Paul, B. J., W. Ross, T. Gaal, and R. L. Gourse. 2004. rRNA transcription in Escherichia coli. Annu. Rev. Genet. 38:749-770. [PubMed]
33. Record, M. T., Jr., W. S. Reznikoff, M. L. Craig, K. L. McQuade, and P. J. Schlax. 1996. Escherichia coli RNA polymerase (Eσ70), promoters, and the kinetics of the steps of transcription initiation, 792-821. In F. C. Neidhardt, R. Curtiss III, J. L. Ingraham, E. C. C. Lin, K. B. Low, B. Magasanik, W. S. Reznikoff, M. Riley, M. Schaechter, H. E. Umbarger (ed.), Escherichia coli and Salmonella: cellular and molecular biology, 2nd ed. ASM Press, Washington, D.C.
34. Ren, B., F. Robert, J. J. Wyrick, O. Aparicio, E. G. Jennings, I. Simon, J. Zeitlinger, J. Schreiber, N. Hannett, E. Kanin, T. L. Volkert, C. J. Wilson, S. P. Bell, and R. A. Young. 2000. Genome-wide location and function of DNA binding proteins. Science 290:2306-2309. [PubMed]
35. Roberts, D. N., A. J. Stewart, J. T. Huff, and B. R. Cairns. 2003. The RNA polymerase III transcriptome revealed by genome-wide localization and activity-occupancy relationships. Proc. Natl. Acad. Sci. USA 100:14695-14700. [PMC free article] [PubMed]
36. Salgado, H., S. Gama-Castro, A. Martinez-Antonio, E. Diaz-Peredo, F. Sanchez-Solano, M. Peralta-Gil, D. Garcia-Alonso, V. Jimenez-Jacinto, A. Santos-Zavaleta, C. Bonavides-Martinez, and J. Collado-Vides. 2004. RegulonDB (version 4.0): transcriptional regulation, operon organization and growth conditions in Escherichia coli K-12. Nucleic Acids Res. 32:D303-D306. [Online.] [PMC free article] [PubMed]
37. Severinov, K. 2000. RNA polymerase structure-function: insights into points of transcriptional regulation. Curr. Opin. Microbiol. 3:118-125. [PubMed]
38. Sippel, A., and G. Hartmann. 1968. Mode of action of rifamycin on the RNA polymerase reaction. Biochim. Biophys. Acta 157:218-219. [PubMed]
39. Szak, S. T., D. Mays, and J. A. Pietenpol. 2001. Kinetics of p53 binding to promoter sites in vivo. Mol. Cell. Biol. 21:3375-3386. [PMC free article] [PubMed]
40. Thieffry, D., H. Salgado, A. M. Huerta, and J. Collado-Vides. 1998. Prediction of transcriptional regulatory sites in the complete genome sequence of Escherichia coli K-12. Bioinformatics 14:391-400. [PubMed]
41. Yeung, K. Y., M. Medvedovic, and R. E. Bumgarner. 2004. From co-expression to co-regulation: how many microarray experiments do we need? Genome Biol. 5:R48. [PMC free article] [PubMed]
42. Zhou, Y. N., N. Kusukawa, J. W. Erickson, C. A. Gross, and T. Yura. 1988. Isolation and characterization of Escherichia coli mutants that lack the heat shock sigma factor σ32. J. Bacteriol. 170:3640-3649. [PMC free article] [PubMed]

Articles from Journal of Bacteriology are provided here courtesy of American Society for Microbiology (ASM)