Copyright Published by Oxford University Press 2006

Anatomy of Escherichia coli σ70 promoters

Center for Cancer Research Nanobiology Program, National Cancer Institute at Frederick, PO Box B, Building 469, Room 144, Frederick, MD 21702-1201, USA

*To whom correspondence should be addressed. Tel: +1 301 846 5581; Fax: +1 301 846 5598; Email: toms/at/ncifcrf.gov

Present addresses:
Ryan K. Shultzaberger, Department of Molecular and Cell Biology, University of California, 16 Barker Hall, Berkeley, CA 94720-3202, USA
Karen A. Lewis, Department of Physiology, University of Texas Southwestern Medical Center at Dallas, 5323 Harry Hines Blvd, Dallas, TX 75390-9040, USA

Received September 9, 2006; Revised October 23, 2006; Accepted October 24, 2006.

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

Information theory was used to build a promoter model that accounts for the −10, the −35 and the uncertainty of the gap between them on a common scale. Helical face assignment indicated that base −7, rather than −11, of the −10 may be flipping to initiate transcription. We found that the sequence conservation of σ70 binding sites is 6.5 ± 0.1 bits. Some promoters lack a −35 region, but have a 6.7 ± 0.2 bit extended −10, almost the same information as the bipartite promoter. These results and similarities between the contacts in the extended −10 binding and the −35 suggest that the flexible bipartite σ factor evolved from a simpler polymerase. Binding predicted by the bipartite model is enriched around 35 bases upstream of the translational start.
This distance is the smallest 5′ mRNA leader necessary for ribosome binding, suggesting that selective pressure minimizes transcript length. The promoter model was combined with models of the transcription factors Fur and Lrp to locate new promoters, to quantify promoter strengths, and to predict activation and repression. Finally, the DNA-bending proteins Fis, H-NS and IHF frequently have sites within one DNA persistence length from the −35, so bending allows distal activators to reach the polymerase.

INTRODUCTION

Transcriptional regulation is essential to the viability of the cell (1–3). In prokaryotes, many molecules can contribute to or detract from the stability of the initiation complex (4). The minimum requirement for RNA polymerase binding is recognition of the promoter by the σ factor (5–8). In general, prokaryotic RNA polymerases can interchange a number of σ factors, which bind to and initiate transcription of different groups of genes (9). σ70 is the most commonly used σ factor in Escherichia coli and it is responsible for the initiation of most genes (9). This paper focuses only on promoters bound by σ70.

To successfully model initiation, it is necessary to construct a model that unifies multiple components. The conventional model for promoter recognition by σ70 is the binding of two regions upstream of the transcription start point, named the −10 and −35 because of their spacing relative to the first transcribed base (10,11). The initiation complex is further stabilized by the C-terminal domain of the two α subunits of the core enzyme (αCTD), which can interact either directly with upstream DNA or with regulatory proteins (12). To add to the complexity of the system, recognition of the −10 alone can be sufficient for initiation to occur (13–15). The initiating polymerase can be thought of as a moving ship that needs to be anchored down (16).
The varying affinities of the binding components for the promoter would correlate to varying weights holding the polymerase in place. The sum of these components must have enough energy to stabilize the polymerase against thermal noise. Therefore, in order to model promoter binding, we need to consider the relative affinity of each molecule affecting the stability of the initiation complex. In addition, the σ factor is flexible. That is, the distance between the −10 and −35 binding sites is not fixed. This flexibility affects the affinity of the polymerase for the sequence (10). If we treat σ factor bound to core as a simple harmonic oscillator, then expansion or contraction of the polymerase when binding to promoters with varying spacings would strain the molecule and reduce the amount of energy available for stabilization. Since the initiation rate is affected by spacing (10,17–21), our model needs to take into account this internal strain. Traditionally three possible spacings have been proposed at which the −10 and the −35 bind relative to each other, 17 ± 1 bases, but initiation over a larger range, 15–20 bases, has been shown (10,19,22). These spacings correspond to the number of bases between the 3′ end of the −35 hexamer and the 5′ end of the −10 hexamer. The observed optimal spacing of 17 bases (10) places the centers of the two hexamers on the same face of the DNA 23 bases apart, ~2 helical twists of B-form DNA (23), suggesting that the polymerase has a DNA-structure-dependent contact. There is also a correlation between the extent of negative supercoiling and the amount of transcription from a promoter (21,22), which demonstrates that the σ factor is sensitive to genomic structure (24,25). 
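The helical arithmetic behind "~2 helical twists" can be checked with a short Python sketch. The function name `helical_phase` is hypothetical, and the default of 10.6 bases per turn is the B-form value quoted later in the Discussion; this is an illustration of the geometry, not part of the model construction.

```python
def helical_phase(separation_bp, bp_per_turn=10.6):
    """Return (number of helical turns, residual rotation in degrees)
    for two positions separated by separation_bp along B-form DNA.

    bp_per_turn = 10.6 is an assumed helical repeat (the value quoted
    in the Discussion); real DNA varies with sequence and supercoiling.
    """
    turns = separation_bp / bp_per_turn
    residual_degrees = (turns % 1.0) * 360.0
    return turns, residual_degrees


# Center-to-center distance of the two hexamers at the optimal spacing.
turns, residual = helical_phase(23)
print(f"{turns:.2f} turns, {residual:.0f} degrees past a whole turn")
```

With these assumptions, 23 bases correspond to a little more than two full turns, which is the sense in which the two hexamer centers lie on roughly the same face of the helix.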
Although neural networks and hidden Markov models (HMMs) have been used to model promoter binding (26–28), constructing these models usually requires the untenable assumption that large stretches of sequence do not contain sites, and the resulting parameters have not been easy to interpret. Other attempts have been made to model promoters using methods not based on HMMs (29–33), but these methods do not uniformly measure the contribution of all components in the initiation complex (−10, −35 and gap). Hertz and Stormo (31) presented a model in which they subtracted the gap penalty for the optimal spacing from each gap value, so that there was no penalty for having the optimal spacing. Their formula to evaluate gap penalties implies that a set of sites that have several equiprobable gap lengths would have no penalties, even though flexibility decreases information (34) (http://www.ccrnp.ncifcrf.gov/~toms/paper/flexrbs/). The method used in this paper, which was previously used to investigate ribosome binding sites (34), does account for gap variability with equiprobable gap lengths. In addition, promoter strengths are not determined purely by the binding of the σ factor. Transcriptional activators and repressors contribute to and detract from the accessibility of DNA by the RNA polymerase. In order to uniformly model the flexible binding of the σ70 in conjunction with transcriptional regulators, we used information theory. Information theory was developed by Claude Shannon to quantify the transfer of information in communications (35,36) (http://cm.bell-labs.com/cm/ms/what/shannonday/paper.html). 
It has proven to be useful when applied to a variety of biological systems (37–41) (http://www.ccrnp.ncifcrf.gov/~toms/paper/schneider1986/; http://www.ccrnp.ncifcrf.gov/~toms/paper/rfs/; http://www.ccrnp.ncifcrf.gov/~toms/paper/fisinfo/; http://www.ccrnp.ncifcrf.gov/~toms/paper/lrp/; http://www.ccrnp.ncifcrf.gov/~toms/paper/baseflip/) mainly in quantifying how specific a given DNA-binding protein is, based on the amount of variability within its binding targets. A lower binding site variability corresponds to a higher information content (37). For convenience, information is generally measured in bits, the choice between two equally likely possibilities. A greater information content for a set of binding sites (more bits of information) generally will have more specific binding and a higher binding affinity. Prokaryotic ribosomes, like the σ factor, have two binding elements separated by a variable distance. In previous work, we used information theory to model 95% of the E. coli ribosome binding sites (34). This flexible model took into account the conservation of both the initiation codon and Shine–Dalgarno regions, and the statistics of the variable spacing between them. Here we applied the same theory to the σ70 promoter components (−35, −10 and their spacing) to create a cohesive model of promoter binding. An important difference between the ribosome model and the promoter model is that the ribosome model is only composed of two binding elements. The promoter model can be made up of a large number of binding elements with parameters governing how each element behaves, and how the elements interact with each other. This paper shows how an internally consistent multi-part model can be constructed directly from experimentally proven sites, and discusses how this model can be used to predict and identify control systems. We are also interested in understanding the fundamental workings of the RNA polymerase. 
To this end, we examined the variation in the promoter as a function of spacing, global trends in accessory molecule binding relative to the promoters, and the relationship between ribosome binding sites and promoters. We also propose that the flexible bipartite binding site evolved from a rigid extended −10 binding mode.

MATERIALS AND METHODS

Constructing the promoter model

We built our σ70 binding model by aligning and refining the sequences upstream of 599 experimentally determined transcription starts reported in the RegulonDb database (42) and 85 starts from the PromEC database (43) that were not included in RegulonDb. To align binding sites we used the malign program to maximize the information of either the −10 or −35 by shuffling the sequences (44) (http://www.ccrnp.ncifcrf.gov/~toms/paper/malign). To refine the model, we iteratively removed all sequences with an information content <0 bits (45,46) (http://www.ccrnp.ncifcrf.gov/~toms/paper/edmm/; http://www.ccrnp.ncifcrf.gov/~toms/paper/ri/) until we converged on a consistent set of sequences. Since the spacing is variable both between the −10 and the transcription start point and between the −35 and the −10, this process was not trivial, and we describe below how we converged on our final model.

To align the −10, we embedded the DNA sequences −15 to −3 bases upstream of the transcription start site in random DNA, so that our alignment would not be biased by the −35 or by the preference for adenine at the transcription start point. We realigned the −10 region to maximize the information by using the malign program (44) over the range of −12 to −7 bases upstream of the transcription start, allowing for the sequences to shift up to 3 bases in either direction.
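The iterative refinement mentioned above (build a model, drop every site that scores below zero bits, rebuild, and repeat until nothing more is removed) can be sketched in Python. This is a minimal illustration and not the delila or malign implementation: the function names are hypothetical, the background is assumed to be equiprobable (2 bits per base), and a pseudocount stands in for the small-sample correction used in the published method.

```python
import math
from collections import Counter

BASES = "ACGT"


def weight_matrix(sites, pseudocount=0.5):
    """Per-position weights 2 + log2 f(b, l) in bits.

    Assumptions: equiprobable background and a pseudocount in place of
    the small-sample correction of the published method.
    """
    n = len(sites)
    matrix = []
    for pos in range(len(sites[0])):
        counts = Counter(site[pos] for site in sites)
        column = {}
        for base in BASES:
            freq = (counts[base] + pseudocount) / (n + 4 * pseudocount)
            column[base] = 2.0 + math.log2(freq)
        matrix.append(column)
    return matrix


def individual_information(site, matrix):
    """Sum the weights of the bases actually present in one site."""
    return sum(column[base] for base, column in zip(site, matrix))


def cyclic_refinement(sites, cutoff=0.0):
    """Rebuild the model and drop sites below cutoff until none are dropped."""
    current = list(sites)
    while current:
        matrix = weight_matrix(current)
        kept = [s for s in current
                if individual_information(s, matrix) >= cutoff]
        if len(kept) == len(current):
            return current, matrix
        current = kept
    return current, None


# Toy data (invented for illustration): ten -10-like hexamers and one outlier.
example_sites = ["TATAAT"] * 8 + ["TATGAT", "TACAAT", "GCGCGC"]
refined, model = cyclic_refinement(example_sites)
print(len(refined))  # the GCGCGC outlier is dropped
```

Because the model is rebuilt after each removal, a site that was marginal in one round can become negative in the next, which is why the loop runs until the set stops changing.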
Since nearby transcription start points could potentially use the same −10, we identified and removed transcription starts from our dataset that were within 15 bases of another site with a lower genomic coordinate (arbitrarily chosen) and had the same orientation. This prevented the same −10 from appearing multiple times in our model, and decreased the size of our dataset from 684 to 620 starts. We then did a cyclic refinement on these sites to remove sequences from our dataset that were not identified as sites by our model. To do this, all sites that had an information content (46) <0 bits were removed, and the model was rebuilt. The zero-bit cutoff was used because it represents a version of the second law of thermodynamics: sites with positive information correspond to negative ΔG of binding (45,46). This approach was successfully used for constructing ribosome (34) and splice site models (47). Removing and rebuilding was continued until no negative sites remained in the set. This reduced our number of sites from 620 to 559. The refined multiple alignment gave us a well conserved −10 region (Figure 1
We do not adhere to the conventional numbering system used in describing the distance between the −10 and the −35. The conventional numbering of the spacer is the number of bases between the two hexamers (10). That is, ttgacaNNNtataat would have a spacing of three. Since the convention for position numbering in asymmetric sequence logos is to choose a strongly conserved base, we will refer to the second base in each hexamer as zero, and all spacings will be reported as the difference between those coordinates. Therefore, all values for our spacing are 6 bases greater than the numbering used previously. For example, tTgacannntAtaat would have a spacing of nine (the difference between the capital T in the −35 hexamer and the capital A in the −10 hexamer), rather than three. Therefore, the classical spacing of 17 bases is 23 in our notation. The only way we would be able to adhere to the conventional numbering system would be to assign a base outside one of the hexamers as the zero coordinate. This would be confusing in sequence analysis using sequence walkers (48), (http://www.ccrnp.ncifcrf.gov/~toms/paper/walker/). Furthermore, each sequence walker always has the integer zero in its coordinate system so that one can easily and unambiguously locate a binding site and then specifically identify bases within the binding site. Aligning the −35 was more difficult than aligning the −10. The traditional model of RNA polymerase binding only allows for three different spacings between the −35 and the −10 (11). Mutational data have shown that this range could be expanded to six positions (10), but that at these expanded spacings the amount of transcription is reduced substantially. The optimal spacing of 23 bp (McClure's spacing of 17 bp) (11) places both hexamers on the same face of the DNA within their respective major grooves (23), indicating that the spacing may be dependent upon DNA structure. 
Therefore, a new alignment method is needed that takes into account the structure of the DNA. The algorithm for aligning the −35 is different from our previous approach of aligning upstream sequences in flexible models [i.e. modeling the Shine–Dalgarno relative to the initiation codon in ribosome binding sites (34)]. As with ribosomes, our approach was to create a model de novo from experimental data, so as to avoid biases in previous models. Using a sequence logo (49) (http://www.ccrnp.ncifcrf.gov/~toms/paper/logopaper/), we observed a weak conservation upstream of the aligned −10s in the region expected for the −35, 23 bp upstream. We determined that the conservation of this region was low because a number of sites with a different spacing were overlapping, reducing the total sequence conservation. We performed a cyclic refinement of the region which corresponds to the −35 hexamer at the optimal spacing from the −10 in order to pull out a preliminary −35 model. This cyclic refinement gave a reasonably well conserved −35 sequence logo, which matched the conventional hexamer consensus (11,50) (http://www.ccrnp.ncifcrf.gov/~toms/papers/zen), and this alignment was used for an initial model. After refinement, we used malign to allow for the sites to be moved 1 base in either direction, so as to maximize the information in this refined −35. Using the program multiscan, the initial −35 model was scanned over the region upstream of the −10 of every promoter in order to find the −35 which most closely matched this model for each site. Of the 559 promoters scanned, 421 had a −35 site >0 bits in the range of 21–26 bases upstream of the −10 [this corresponds to McClure's spacing of 15–20 bases (10)]. The alignment having the strongest site was used, and the total site strength was calculated using the flexible information equation described previously (34):
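As a sketch, assuming the same form as the flexible ribosome binding site model (34), the total individual information of a promoter whose −35 and −10 zero coordinates are d bases apart can be written as below. The symbols GS(d), n(d) and n are notation assumed here for illustration, and any small-sample correction to the gap term is ignored.

```latex
\begin{equation}
R_i(\mathrm{total}) = R_i(-35) + R_i(-10) - GS(d),
\qquad
GS(d) = -\log_2\!\left(\frac{n(d)}{n}\right)
\end{equation}
```

Here R_i(−35) and R_i(−10) are the individual information of the two hexamers in bits, n(d) is the number of promoters in the dataset with spacing d, and n is the total number of promoters. The gap surprisal GS(d) is therefore small for common spacings and large for rare ones, which puts the spacing on the same scale of bits as the sequence conservation.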
Once we had identified the −35, the −10 and the spacing for each promoter, we dispensed with the ‘scaffolding’ equations described above and built a flexible model directly from the sequence data. We did a further cyclic refinement on this set by removing promoters with a flexible information <0 bits, reducing the number of promoters from 421 to 401. Our final model, therefore, contains 59% of the sites that are in our original database. Transcriptional regulators can provide informational contacts through the αCTD and this could account for some of the information used by polymerases (52). As a result, many promoters may have poorly conserved, or highly variable, σ binding sites. The refinement process made the model self-consistent (containing similar sites) and it can therefore be regarded as a basal promoter model. The excluded sites are not consistent with this basal model, and presumably they initiate by some method other than the sole recognition of the −10 and −35 (such as an extended −10 or activation by another protein).

Promoter binding and transcriptional regulation are more complex than our previous flexible modeling system was able to handle because of the contribution of activation proteins (52). Therefore, we created an algorithm that not only considers the strength of the two-part σ70 site, but also can include the information contributed by activating proteins. In order to do this we used the multiscan algorithm.

Multiscan algorithm

Multiscan is an extension of the biscan program that is used to model the flexible prokaryotic ribosome (34). Translational initiation in prokaryotes requires contact at both the P site (or initiation region, IR) and the Shine–Dalgarno (SD). Because of the flexibility of the ribosome, these contacts can occur at different spacings, anywhere between 4 and 18 bases.
In order to assess the information present in ribosome binding sites, the contributions of the Shine–Dalgarno, the initiation region and the spacing between them all have to be considered. The equation for calculating the information for a two-part model with variable spacing was given in Equation (1).

The RNA polymerase is similar to the ribosome in that it makes two contacts (the −10 and the −35) with some variable distance between them. Therefore, the flexible information analysis used with ribosomes can also be used to describe the binding of the σ factor to the promoter. The difference between translational and transcriptional initiation is that auxiliary proteins can also bind to either activate or repress transcription. So in order to model the promoter correctly, we need to calculate the information of all the molecules that contribute or interfere. For activators, as an initial simple model, we assume that their information contributes additively to the total information of the promoter (53), so the new equation is as follows:
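A hedged sketch of such an additive form is given below. It assumes that each activator site k contributes its individual information minus a gap surprisal GS (the cost in bits of its distance from the promoter), that the net contribution is counted only when positive, and that the sum is scaled by the factor ρAct; the index k, the distances d_k and the use of max(0, ·) are notation assumed here for illustration.

```latex
\begin{equation}
R_i(\mathrm{total}) = R_i(-35) + R_i(-10) - GS(d)
  + \rho_{\mathrm{Act}} \sum_{k} \max\bigl(0,\; R_i(\mathrm{Act}_k) - GS(d_k)\bigr)
\end{equation}
```

The first three terms are the two-part σ70 site with spacing d. Each term in the sum represents one activator site found at distance d_k from the promoter.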
This algorithm only includes the activator site and corresponding gap surprisal if they contribute positive information, since that corresponds to favorable binding (46). In addition, the number of potential activators is limited only by the length of the sequence. That is, if multiple activators bind in a range relative to the RNA polymerase that has been observed to be advantageous to transcription initiation, then they are all included in the total information for the site. At present, the algorithm does not account for the possibility of repression of one activator by another (54).

Although it seems reasonable to assume that activator protein information can be scaled by ρAct and added to the total information of the promoter, it is not clear that repressor information should be subtracted. Since repressors block the binding of the polymerase to the DNA, or, in cases such as GalR, cause DNA loops that block binding (55), they do not decrease the strength of the contact but, if present, totally prevent contact from occurring. Therefore, it does not matter what the strength of the repressor is, because a repressor bound to a 1 bit site will prevent initiation just as well as a repressor bound to a 10 bit site. The difference between the two is that the 10 bit site will be bound more frequently, so the relative site strengths between the polymerase and the repressor (as well as the concentration of both molecules) can be used to predict the frequency of transcription, but not the ability of the polymerase to bind.

Promoter analysis using the σ70 model

Sequence logos for promoter components were made using the programs delila, alist, encode, rseq, dalvec and makelogo as described previously (49,51). We used the programs diffinst, genhis and genpic to generate the spacing distribution between the binding components. During our refinement process, we identified 138 experimentally verified promoters that did not have an upstream −35.
We used this subset to build the extended −10 model (Figure 3
To determine spacings between the −10 and the translational initiation codon (Figure 4
In order to analyze individual sequences using our σ70 model and a transcriptional regulator, we used sequence walker technology (46,48,57). Flexible sites were located using multiscan and displayed as sequence walkers using lister. For Figures 5
To identify these novel control elements, we scanned the entire genome for Fur sites that overlapped the σ70 binding sites within 200 bases of translational start points. These two were chosen because of the strength of both the Fur and σ70 sites, and their proximity to each other. Gel shifts confirmed that these sites are bound by Fur (data not shown). Finally, the relative binding plots (Figure 8
To compute the intergenic density distributions (Figure 8

RESULTS

The σ70 model

Our dataset for σ70 promoters consisted of both the RegulonDb and the PromEC databases (42,43). Unlike eukaryotic start points, which contain ~3 bits of information (60), there did not appear to be much information at this prokaryotic transcription start point, only 0.39 ± 0.06 bits (Figure 1

We were quite easily able to align the −10 relative to the transcription start points (Figure 1

The most striking feature of the −10 logo is the strongly conserved T (position +4) where the protein is likely to face the minor groove of the DNA. In other logos for DNA-binding proteins, conservation of bases rarely exceeds 1 bit in the minor groove (62), because in B-form DNA the exposed groups in the minor groove can only be used to distinguish A or T from C or G, but not all four bases individually (63). High conservation in the minor groove suggests that this base is being contacted atypically. Possible explanations are that the helix is distorted when bound, that the base interacts with σ70 in the open complex (64), or that it is flipped out of the helix to initiate open complex formation, as may occur in DNA replication (41).

We had greater difficulty aligning the −35, perhaps because the −35 is often replaced by activators (1,52). Traditionally, the placement of the −35 relative to the −10 is ±1 base relative to the most frequent position of 23 bases [McClure's 17 (11), see Materials and Methods]. Experimental data have shown initiation at a range of −2 to +3 relative to the most frequent position (10,19,22). When we allowed for our model to include sites in this expanded spacing, we identified 107 promoters (>0 bits) that had no possible −35 in the traditional range, suggesting that binding does occur at these peripheral spacings. The amount of conservation in the −35, as in the −10, was low compared with other DNA-binding proteins (37) (4.02 ± 0.09 bits).
The conserved region of the −35 appears to fill only one-half of the major groove, as there is an abrupt termination of conservation on the 5′ edge (Figure 1 We allowed for the spacing range between the two hexamers to be between 21 and 26 bases. The optimal spacing between the zero coordinates of the −35 and −10 was 23 bases. The spacing distribution appeared to be approximately Gaussian with an uncertainty of 2.32 ± 0.04 bits. The total sequence conservation (Rsequence) for the σ70 model (−35, gap, −10) is 6.48 ± 0.14 bits (Figure 1 To see how the σ70 promoter varies with spacing, logos were made for each of the spacing classes (Figure 2
The conservation of the −35 at each spacing in Figure 2 We looked at the −10 as a function of spacing relative to the transcription start point (data not shown). There was a slight increase in the conservation of the 5′ T of the −10 hexamer at greater spacings. Besides that, there was little variability in the logos of the different spacing classes. The amount of information at the transcription start point was small (~0.4 bits) for all spacings, and the slight variability between them could be accounted for by noise. To avoid duplicate sites, we had excluded 64 promoters from our dataset which were within 15 bases of another transcript start (see Materials and Methods). Having built the model, we went back and scanned it over these regions to see if we could predict promoters upstream of more complex overlapping transcripts (data not shown). In 25 cases, for every experimentally determined transcriptional start point there were one or more distinct predicted promoters. In 17 of the 64 cases there was only one predicted promoter and both transcripts fell within the known distance distribution of 8–14 bases downstream from the −10 (Figure 1 To further verify our promoter model, we scanned it over the starts of 36 small RNAs presented by Hershberg et al. (66) (data not shown). The model identified promoters upstream of 23 of the 36 starts. That is, ~64% of the sites had a promoter with a total information >0 bits from 8 to 14 bases (Figure 1 The conservation of bases that we observed in our model resembled previous non-information theory based alignments (11), mutation data (10,11,20), an in vivo selection assay (67), and the −10 sequence logo published previously for a smaller dataset (41). The mutation data of Moyle et al. (68) had a 0.6 correlation coefficient to the predicted individual information for our complete promoter model (data not shown). 
These results are consistent with observations by Mirny and Gelfand (69) who demonstrated a good correlation between sequence conservation and the number of base contacts a protein makes with DNA.

Creating a model for the −35 was difficult, presumably because many promoters are activated and the activator could take over the sequence conservation from the −35, as proposed by Raibaud and Schwartz (52). To test this hypothesis we scanned the 14 sequences in the E. coli genome reported to be positively activated by Raibaud and Schwartz and determined which −35 was strongest in the 10 bp window they allowed. In contrast to the 4.0 ± 0.1 bits in our −35 model, these −35 sequences in activated promoters were no more than 1 ± 4 bits. The weak conservation of positively activated sites probably does explain why creating a −35 model is difficult. To our knowledge this is the only published dataset of confirmed positively activated E. coli promoters.

The extended minus 10

Initiation has been shown to occur in the absence of a −35 in conjunction with an extension to the −10 region (13,14). During our refinement process, a subset of 138 promoters did not have a predicted −35 binding site, and these were subsequently removed from the flexible basal model. A sequence logo revealed that the removed subset of promoters contained a weakly conserved TG upstream of the −10 hexamer, in the region identified as the extended −10. We therefore did a cyclic refinement of the two bases containing the weakly conserved TG, and a well-conserved extended −10 emerged (Figure 3

The relationship between the promoter and the ribosome binding site

In order to determine if there are any spacing preferences between the zero coordinate of the −10 and the translational initiation codon, we scanned our σ70 model upstream of all 4122 annotated genes in E. coli (56).
We saw one substantial peak in the spacing histogram of predicted promoters, ~30–40 bases upstream of the ATG (Figure 4

For all 401 promoters in our model, we did not see any correlation between promoter strength and the strength of its downstream ribosome binding site as measured by the individual information contents of each flexible model (data not shown). We did find that the lowest combined individual information of a promoter and an RBS was 3.49 bits.

Transcriptional regulation

We used the σ70 model in conjunction with transcriptional regulator models to study promoter structures in non-basal conditions. As an example, we show the experimentally verified Fur-controlled gene tonB (71) and the degree to which Fur represses it (Figure 5

As a second example, we used the transcriptional regulator Lrp (40), which can both activate and repress transcription in E. coli (Figure 6

Since there are at least seven identified transcription start points for the dad operon (74,75), we used our model to see if we could identify the corresponding promoters. The three most downstream starts marked in Figure 6a

Interestingly, the computed strength of the dad promoters increases as they get closer to the gene start point but this effect is not observed in arcA (data not shown), which also has seven verified transcript starts (70). Two bound Lrp molecules (11.1 and 11.7 bits) could block the binding and initiation of the two downstream dad promoters and four subsequent downstream transcripts (3, 4, 5, 7), and possibly prevent the opening of transcript 6. The most downstream promoter (transcript 7) is 38 bases away from the translational start point, which produces a transcript only a few bases longer than needed to contain the conserved Shine–Dalgarno region (34), showing an optimization of cellular resources by minimizing the length of mRNAs. We predicted Lrp binding in the region protected by the upstream footprint, but the site was relatively weak at 1.5 bits (data not shown).
Lrp activation of the gltBDF (76) operon can also be predicted using sequence walkers (Figure 6b

Based on the two examples in Figure 6

Besides being able to dissect well-understood genetic control systems, we would like to be able to predict new ones for testing. Using a Fur model and the flexible sigma model, we identified a number of potential Fur repressed genes in the E. coli genome. We report two of these cases here (Figure 7

Where DNA bending proteins bind relative to promoter components

We searched for the relative placement of transcriptional regulators near the starts of all of the promoters in our model (Figure 8

We determined the range of non-coding sequences surrounding each promoter component in our dataset of 401 promoters (Figure 1

Previous analysis by Robison et al. had also identified a preference for genetic control elements to bind in intergenic regions (59), but that analysis was not done in reference to the alignment of promoter components.

DISCUSSION

Genetic control systems often consist of multiple binding components with variable distances between them. These variable distances can affect the stability of the binding complex. Our approach is to use experimentally demonstrated binding sites to construct models. Unlike neural networks, this approach avoids the assumption that untested stretches of nucleotide sequence do not contain binding sites, and it sets the model upon a firm foundation. Our model construction uses information theory, which not only allows measurements of the patterns at the binding sites, but can also account for distance preferences on the same quantitative and universal scale of bits (37). We previously used a flexible modeling method for ribosome binding sites (34). That successful application suggested that information theory can be applied to any multi-part binding system where binding is affected by the spacing between components.
In this paper we show that the same approach works well to quantify prokaryotic promoters, which have two binding components at approximately −10 and −35 bases from the start of transcription (11).

A σ70 model based on information theory

The amount of sequence conservation in the −10 and in the −35 is fairly low, ~5 and 4 bits, respectively. As is found for most DNA-binding proteins (41,62), both sequence logos appear to follow a sine wave, which represents the 10.6 base helical twist of B-form DNA (Figure 1

We used the T. aquaticus σA/−35 co-crystal (65) to determine where the σ protein faces the major groove with respect to the −35. Using the average gap distance, this assignment places the major groove of the T–A base pair at position +4 of the −10 on exactly the opposite face of the DNA from the −35 (Figure 1

In addition, with the exception of +4, the pattern of sequence conservation of the extended −10 follows the sine wave (Figure 3

Sclavi et al. used hydroxyl radical footprinting to look at intermediates in open complex formation (83). They observed that protection at position 0 (−11 in conventional numbering) occurs after protection in the region of +3 to +5 (−8 to −6 in conventional numbering). This is consistent with the T at +4 (−7 in conventional numbering) initiating DNA melting through a base flipping mechanism (41), which would explain why this position appears anomalous in the sequence logo. In contrast, it has been proposed that flipping of the A at position −11 in the minus ten (our number 0) initiates DNA melting to form the open complex (84–90). If this is the case, why is our groove assignment 5 bases (180°) different? One possibility is that the DNA helix could be distorted between the −35 and the −10, which our model does not account for. However, Young et al.
showed that a small part of the σ factor and the first 314 amino acids of the β subunit are sufficient to initiate promoter melting, which probably excludes DNA bending or twisting (91). Furthermore, a co-crystal structure of a fork-junction DNA bound to a holoenzyme (7) shows smoothly bent B-form DNA from the −35 to just before the −10. In this structure the extended −10 is contacted in the major groove, consistent with Figure 3 …

The amount of conservation in the −35 is fairly weak. The clear absence of sequence conservation in the major groove immediately upstream of the −35 could leave room for activating proteins to bind and to stabilize the polymerase [this is supported by the σA/−35 co-crystal structure (65)]. By interacting with the polymerase near the −35 contact, accessory molecules could make σ70 a much more discriminating binder.

Penotti found that the distance between the human TATA sites and the transcriptional start is variable, with an uncertainty of ~3 bits (60). He also observed that there are ~3 bits of information at the start point itself. In other words, the information of the start (Rsequence) is just sufficient for it to be located with respect to the TATA (Rfrequency), which is the smallest known example of this evolutionary principle (37,92) (http://www.ccrnp.ncifcrf.gov/~toms/paper/ev/). Unlike eukaryotic transcription starts, there is a low conservation of bases at the transcription start point of E. coli (0.39 ± 0.06 bits). Because the average gap surprisal between the −10 and the transcriptional start (2.56 ± 0.04 bits) exceeds the information at the start point, we propose that the choice of the base at which polymerization begins is influenced more by the detailed path of the RNA through the open complex (7) than by the actual base at the start.

The conventional spacing allowed between the −10 and the −35 only varies by 3 bases (11), but to account for experimental data (10,17,18), we allowed six bases.
Most promoters do fall into the traditional three spacing classes, but binding at further spacings seems experimentally and statistically (Figure 2 …

The overall low information content of the entire σ70 binding site suggests that the RNA polymerase binds frequently along the genome (70,93). With a total information of 6.48 ± 0.14 bits, σ70 would bind approximately once in every 90 bases in random equiprobable DNA. This would lead to 10 times more transcripts than genes in E. coli. The promiscuous nature of the polymerase may be necessary to allow transcription of many different genes in the genome. The polymerase must bind independently of gene function, so it must be indiscriminate enough to bind to a variety of control regions. This suggests that transcription is frequently influenced not only by the strength of the sigma binding site, but also by regulatory molecules. It is also possible that many small RNAs are generated, as has been discovered recently (66,93,94). In fragments from E. coli with lengths of 163 ± 24 bp, Kawano et al. found 0.76 promoters in one orientation (93). From this we compute Rfrequency = −log2[2 × 0.76/(163 ± 24)] = 6.7 ± 0.2 bits per site. This is remarkably close to the value for our model, Rsequence = 6.48 ± 0.14 bits per site, and it shows that, as with other genetic systems, the information in the binding sites is sufficient to locate the sites in the genome (37,92). This quantitatively demonstrates that information theory provides a reasonable basal model of polymerase binding.

Evolutionary implications of the extended minus 10

Surprisingly, we were easily able to isolate 84 promoters that lack a −35 and exhibit an extended −10 (13,14). The information content of extended −10 promoters is almost identical to the information content of the entire σ70 model (6.7 and 6.5 bits, respectively).
That is, the information contribution of the −35 hexamer in the flexible promoter (Figure 1 …

There is a correlation between the amount of conservation within binding sites (the average information content or Rsequence) and the amount of information needed to locate binding sites in the genome (37,92). Also, it appears that the information of a site (or group of sites) relates to the energetics of the system (45). Therefore, since these two promoter classes have a similar information content, we assume that they are equally able to be identified in the genome and to stabilize the polymerase.

A single binding element, such as the extended −10, is a much simpler machine to evolve than a two-part flexible binder. The bacteriophage T7 RNA polymerase (95) has only one binding element (96), so having two widely separated parts is not essential for transcription. Therefore, we suggest that in prokaryotes the extended −10 may be an evolutionary predecessor to the modern bipartite promoter. Another possibility is that the bipartite promoter is the evolutionary predecessor of the extended −10, but this does not explain the origin of bipartite promoters. Although they have a similar amount of information, the single-element promoter (Figure 3 …

An advantage of having two widely separated binding components may be to increase promoter strength disparities through interactions with transcriptional regulators. By having a larger binding region, there are more spatial opportunities for accessory proteins to affect the initiation complex. As shown in Figures 5 …

Why would the cell evolve a flexible bipartite binding mechanism? A possible explanation could be that this mechanism allows a polymerase bound to the promoter to sense genomic structure. Indeed, transcriptional initiation has been observed to vary with the superhelicity of the DNA (22,25,98,99).
These differences in the rate of transcription could be from differences in the meltability of the promoter or the stability of the closed complex (11,22). Also, the spacing between the −10 and −35 is large, two helical turns of DNA, which increases polymerase sensitivity to the overall structure, since twist or bending effects are amplified over larger distances. Twist and bending strain could affect polymerase contacts at both the −10 and −35, as shown in Figure 2 …

The two extra bases of the extended −10 are similar to positions 0 and +1 of the −35 (Figures 1 …

In comparison to the E. coli σ70 and σ32, both of which appear to contain two helix–turn–helix DNA-binding domains, the σ55 factor produced by bacteriophage T4 for late transcription contains only one helix–turn–helix motif (1). Correspondingly, the T4 σ55 only recognizes a −10 region which contains about 16.2 bits of information (61). This is close to the 17.6 bits required to locate the 50 known late promoters in the E. coli genome (37). A pared-down RNA polymerase is able to recognize and open an extended −10 (91). These observations are consistent with the hypothesis that −10 recognition evolved first, followed by appearance of the −35.

The relationship between the promoter and the ribosome binding site

The information content of the flexible ribosome binding site (34) is greater than the σ70 model, 9.28 ± 0.06 versus 6.48 ± 0.14 bits, respectively. This most simply suggests that there is often more than one promoter per coding region in the cell as, for example, shown in Figure 6 …

When we scanned our promoter model upstream of all annotated genes in E. coli (56), our model frequently identified sites at a spacing of ~35 bases between the zero coordinate of the −10 and the first base of the start codon (Figure 4 …

Transcriptional regulation

Our original dataset of experimentally verified transcription start points was larger than the number of sites in our final model (684 versus 401).
The cyclic refinement that removed sites focused the original group down by selecting a subset that is coherent. The excluded sites were weak (information content is <0 bits) compared to the retained subset and are therefore presumably activated or use a different sigma factor. Also, as previously noted, the average information of unregulated promoters is fairly low (6.48 ± 0.14 bits), implying that the polymerase binds frequently throughout the genome. These observations are consistent with a role for regulatory proteins in helping to stabilize weak promoters.

Intergenic regions have a composition that is different from coding regions, and protein-binding domains have evolved to bind the intergenic regions (59). As shown in Figure 4 …

The −35 is the most upstream component of the promoter, and it is closest to the αCTD, to which activator proteins bind (12,53). This explains why the alignments shown in Figure 8 …

Without DNA bending, activators more than 20 bases upstream would have difficulty binding to the αCTD (53), because at distances shorter than the persistence length (150–200 bp) DNA is like a rigid rod (102). Furthermore, pairs of DNA sites come together most easily when they are ~300 bp apart (103,104). These physical limits mean that activators would be restricted to be either immediately upstream of the polymerase, as noted previously (105), or at least 300 bases away. DNA bending proteins and curved DNA, which is found in the intergenic regions (79,106), loosen these restrictions and allow activators bound within 300 bases upstream of the promoter to function. This explains why Fis, H-NS and IHF are often found within 300 bases of the promoter (Figure 8 …

The number of Fis dimers in the cell has been shown to drastically increase in response to nutritional upshifts (107).
If a major role of Fis in the cell is to facilitate activation through DNA-bending within the persistence length, then these results show how Fis can act as a powerful global regulator, linking transcription to cellular nutrition (39,79,108).

Our individual information analysis (Figures 5 …

SUPPLEMENTARY DATA

Supplementary Data are available at NAR Online.
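The frequency arithmetic quoted in the Discussion can be checked with a short sketch. It assumes sites scattered through random equiprobable DNA, so that a site carrying R bits is expected about once every 2^R positions, and it derives Rfrequency from an observed density of sites. The function and parameter names are illustrative and do not come from the paper's software. The `orientations` factor stands for the 2 in the Rfrequency expression, and the error term is a first-order propagation of the fragment-length spread, which is only an assumption about how the ± 0.2 bits might be obtained.

```python
import math


def bases_per_site(r_sequence_bits: float) -> float:
    """Expected spacing (in bases) between sites of a given information
    content, assuming random equiprobable DNA."""
    return 2.0 ** r_sequence_bits


def r_frequency(sites_per_fragment: float,
                fragment_length_bp: float,
                orientations: int = 2) -> float:
    """Information (bits per site) needed to locate sites that occur at the
    observed density; `orientations` doubles a count made on one strand."""
    sites_per_position = orientations * sites_per_fragment / fragment_length_bp
    return -math.log2(sites_per_position)


def r_frequency_error(fragment_length_bp: float, length_sd_bp: float) -> float:
    """First-order uncertainty in Rfrequency (bits) from the spread in
    fragment length: d(log2 L) = (dL / L) / ln 2."""
    return (length_sd_bp / fragment_length_bp) / math.log(2)


if __name__ == "__main__":
    # Values taken from the text: Rsequence = 6.48 bits; 0.76 promoters per
    # 163 +/- 24 bp fragment in one orientation.
    print(f"spacing    : {bases_per_site(6.48):.1f} bases per site")
    print(f"Rfrequency : {r_frequency(0.76, 163):.2f} bits per site")
    print(f"uncertainty: {r_frequency_error(163, 24):.2f} bits")
```

Run as a script, this gives a spacing just under 90 bases and an Rfrequency near 6.7 bits with an uncertainty of about 0.2 bits, in line with the figures quoted in the text.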
Acknowledgments

We would like to thank Dmitry Vassylyev for providing atomic coordinates for the closed promoter and Heladia Salgado for providing us with the RegulonDB database. We would also like to thank Brent Jewett, Danielle Needle, Michael Levashov, Aidan Ryan and Pete Rogan for their comments. This research was supported in part by the Intramural Research Program of the NIH, National Cancer Institute, Center for Cancer Research. Funding to pay the Open Access publication charges for this article was provided by NCI.

Conflict of interest statement. None declared.

REFERENCES

1. Horwitz M.S., Loeb L.A. Structure–function relationships in Escherichia coli promoter DNA. Prog. Nucleic Acid Res. Mol. Biol. 1990;38:137–164. [PubMed] 2. Gralla J.D. Promoter recognition and mRNA initiation by Escherichia coli E σ70. Methods Enzymol. 1990;185:37–54. [PubMed] 3. deHaseth P.L., Zupancic M.L., Record M.T., Jr RNA polymerase–promoter interactions: the comings and goings of RNA polymerase. J. Bacteriol. 1998;180:3019–3025. [PubMed] 4. Browning D.F., Busby S.J. The regulation of bacterial transcription initiation. Nat. Rev. Microbiol. 2004;2:57–65. [PubMed] 5. Burgess R.R., Travers A.A., Dunn J.J., Bautz E.K. Factor stimulating transcription by RNA polymerase. Nature. 1969;221:43–46. [PubMed] 6. Gross C.A., Chan C., Dombroski A., Gruber T., Sharp M., Tupy J., Young B. The functional and regulatory roles of sigma factors in transcription. Cold Spring Harb. Symp. Quant. Biol. 1998;63:141–155. [PubMed] 7. Murakami K.S., Masuda S., Campbell E.A., Muzzin O., Darst S.A. Structural basis of transcription initiation: an RNA polymerase holoenzyme–DNA complex. Science. 2002;296:1285–1290. [PubMed] 8. Young B.A., Gruber T.M., Gross C.A. Views of transcription initiation. Cell. 2002;109:417–420. [PubMed] 9. Gross C., Lonetto M., Losick R. Bacterial sigma factors. In: McKnight S.L., Yamamoto K.R., editors. Transcriptional Regulation. New York: Cold Spring Harbor Laboratory Press; 1992. pp.
129–176. 10. Hawley D.K., McClure W.R. Compilation and analysis of Escherichia coli promoter DNA sequences. Nucleic Acids Res. 1983;11:2237–2255. [PubMed] 11. McClure W.R. Mechanism and control of transcription initiation in prokaryotes. Annu. Rev. Biochem. 1985;54:171–204. [PubMed] 12. Benoff B., Yang H., Lawson C.L., Parkinson G., Liu J., Blatter E., Ebright Y.W., Berman H.M., Ebright R.H. Structural basis of transcription activation: the CAP–αCTD–DNA complex. Science. 2002;297:1562–1566. [PubMed] 13. Keilty S., Rosenberg M. Constitutive function of a positively regulated promoter reveals new sequences essential for activity. J. Biol. Chem. 1987;262:6389–6395. [PubMed] 14. Barne K.A., Bown J.A., Busby S.J., Minchin S.D. Region 2.5 of the Escherichia coli RNA polymerase σ70 subunit is responsible for the recognition of the ‘extended -10’ motif at promoters. EMBO J. 1997;16:4034–4040. [PubMed] 15. Kumar A., Malloch R.A., Fujita N., Smillie D.A., Ishihama A., Hayward R.S. The minus 35-recognition region of Escherichia coli sigma 70 is inessential for initiation of transcription at an ‘extended minus 10’ promoter. J. Mol. Biol. 1993;232:406–418. [PubMed] 16. Eichenberger P., Dethiollaz S., Buc H., Geiselmann J. Structural kinetics of transcription activation at the malT promoter of Escherichia coli by UV laser footprinting. Proc. Natl Acad. Sci. USA. 1997;94:9022–9027. [PubMed] 17. Mandecki W., Reznikoff W.S. A lac promotor with a changed distance between -10 and -35 regions. Nucleic Acids Res. 1982;10:903–912. [PubMed] 18. Aoyama T., Takanami M., Ohtsuka E., Taniyama Y., Marumoto R., Sato H., Ikehara M. Essential structure of E. coli promoter: effect of spacer length between the two consensus sequences on promoter function. Nucleic Acids Res. 1983;11:5855–5864. [PubMed] 19. Dombroski A.J., Johnson B.D., Lonetto M., Gross C.A. The sigma subunit of Escherichia coli RNA polymerase senses promoter spacing. Proc. Natl Acad. Sci. USA. 1996;93:8858–8862. [PubMed] 20. 
Stefano J.E., Gralla J.D. Mutation-induced changes in RNA polymerase–lac ps promoter interactions. J. Biol. Chem. 1982;257:13924–13929. [PubMed] 21. Borowiec J.A., Gralla J.D. All three elements of the lac pS promoter mediate its transcriptional response to DNA supercoiling. J. Mol. Biol. 1987;195:89–97. [PubMed] 22. Aoyama T., Takanami M. Supercoiling response of E. coli promoters with different spacer lengths. Biochim. Biophys. Acta. 1988;949:311–317. [PubMed] 23. Vassylyev D.G., Sekine S., Laptenko O., Lee J., Vassylyeva M.N., Borukhov S., Yokoyama S. Crystal structure of a bacterial RNA polymerase holoenzyme at 2.6 Å resolution. Nature. 2002;417:712–719. [PubMed] 24. Lim H.M., Lewis D.E., Lee H.J., Liu M., Adhya S. Effect of varying the supercoiling of DNA on transcription and its regulation. Biochemistry. 2003;42:10718–10725. [PubMed] 25. Peter B.J., Arsuaga J., Breier A.M., Khodursky A.B., Brown P.O., Cozzarelli N.R. Genomic transcriptional response to loss of chromosomal supercoiling in Escherichia coli. Genome Biol. 2004;5:R87. [PubMed] 26. Lukashin A.V., Anshelevich V.V., Amirikyan B.R., Gragerov A.I., Frank-Kamenetskii M.D. Neural network models for promoter recognition. J. Biomol. Struct. Dyn. 1989;6:1123–1133. [PubMed] 27. Weller K., Recknagel R.D. Promoter strength prediction based on occurrence frequencies of consensus patterns. J. Theor. Biol. 1994;171:355–359. [PubMed] 28. GuhaThakurta D., Stormo G.D. Identifying target sites for cooperatively binding factors. Bioinformatics. 2001;17:608–621. [PubMed] 29. Galas D.J., Eggert M., Waterman M.S. Rigorous pattern-recognition methods for DNA sequences. Analysis of promoter sequences from Escherichia coli. J. Mol. Biol. 1985;186:117–128. [PubMed] 30. Mulligan M.E., Hawley D.K., Entriken R., McClure W.R. Escherichia coli promoter sequences predict in vitro RNA polymerase selectivity. Nucleic Acids Res. 1984;12:789–800. [PubMed] 31. Hertz G.Z., Stormo G.D. 
Escherichia coli promoter sequences: analysis and prediction. Methods Enzymol. 1996;273:30–42. [PubMed] 32. O'Neill M.C. Consensus methods for finding and ranking DNA binding sites: application to Escherichia coli promoters. J. Mol. Biol. 1989;207:301–310. [PubMed] 33. Harley C.B., Reynolds R.P. Analysis of E. coli promoter sequences. Nucleic Acids Res. 1987;15:2343–2361. [PubMed] 34. Shultzaberger R.K., Bucheimer R.E., Rudd K.E., Schneider T.D. Anatomy of Escherichia coli ribosome binding sites. J. Mol. Biol. 2001;313:215–228. [PubMed] 35. Shannon C.E. A mathematical theory of communication. Bell Syst. Tech. J. 1948;27:379–423. 623–656. 36. Pierce J.R. An Introduction to Information Theory: Symbols, Signals and Noise. NY: Dover Publications, Inc.; 1980. 37. Schneider T.D., Stormo G.D., Gold L., Ehrenfeucht A. Information content of binding sites on nucleotide sequences. J. Mol. Biol. 1986;188:415–431. [PubMed] 38. Rogan P.K., Faux B.M., Schneider T.D. Information analysis of human splice site mutations. Hum. Mutat. 1998;12:153–171. [PubMed] 39. Hengen P.N., Bartram S.L., Stewart L.E., Schneider T.D. Information analysis of Fis binding sites. Nucleic Acids Res. 1997;25:4994–5002. [PubMed] 40. Shultzaberger R.K., Schneider T.D. Using sequence logos and information analysis of Lrp DNA binding sites to investigate discrepancies between natural selection and SELEX. Nucleic Acids Res. 1999;27:882–887. [PubMed] 41. Schneider T.D. Strong minor groove base conservation in sequence logos implies DNA distortion or base flipping during replication and transcription initiation. Nucleic Acids Res. 2001;29:4881–4891. [PubMed] 42. Salgado H., Santos-Zavaleta A., Gama-Castro S., Millan-Zarate D., Diaz-Peredo E., Sanchez-Solano F., Perez-Rueda E., Bonavides-Martinez C., Collado-Vides J. RegulonDB (version 3.2): transcriptional regulation and operon organization in Escherichia coli K-12. Nucleic Acids Res. 2001;29:72–74. [PubMed] 43. 
Hershberg R., Bejerano G., Santos-Zavaleta A., Margalit H. PromEC: an updated database of Escherichia coli mRNA promoters with experimentally identified transcriptional start sites. Nucleic Acids Res. 2001;29:277. [PubMed] 44. Schneider T.D., Mastronarde D. Fast multiple alignment of ungapped DNA sequences using information theory and a relaxation method. Discrete Appl. Math. 1996;71:259–268. 45. Schneider T.D. Theory of molecular machines II. Energy dissipation from molecular machines. J. Theor. Biol. 1991;148:125–137. [PubMed] 46. Schneider T.D. Information content of individual genetic sequences. J. Theor. Biol. 1997;189:427–441. [PubMed] 47. Rogan P.K., Svojanovsky S., Leeder J.S. Information theory-based analysis of CYP2C19, CYP2D6 and CYP3A5 splicing mutations. Pharmacogenetics. 2003;13:207–218. [PubMed] 48. Schneider T.D. Sequence walkers: a graphical method to display how binding proteins interact with DNA or RNA sequences [Erratum (1998) Nucleic Acids Res., 26, 1135.]. Nucleic Acids Res. 1997;25:4408–4415. [PubMed] 49. Schneider T.D., Stephens R.M. Sequence logos: a new way to display consensus sequences. Nucleic Acids Res. 1990;18:6097–6100. [PubMed] 50. Schneider T.D. Consensus sequence zen. Appl. Bioinformatics. 2002;1:111–119. [PubMed] 51. Schneider T.D. Reading of DNA sequence logos: prediction of major groove binding by information theory. Methods Enzymol. 1996;274:445–455. [PubMed] 52. Raibaud O., Schwartz M. Positive control of transcription initiation in bacteria. Annu. Rev. Genet. 1984;18:173–206. [PubMed] 53. Busby S., Ebright R.H. Transcription activation by catabolite activator protein (CAP). J. Mol. Biol. 1999;293:199–213. [PubMed] 54. Hengen P.N., Lyakhov I.G., Stewart L.E., Schneider T.D. Molecular flip-flops formed by overlapping Fis sites. Nucleic Acids Res. 2003;31:6663–6673. [PubMed] 55. Semsey S., Virnik K., Adhya S. Three-stage regulation of the amphibolic gal operon: from repressosome to GalR-free DNA. J. Mol. Biol. 2006;358:355–363. 
[PubMed] 56. Rudd K.E. EcoGene: a genome sequence database for Escherichia coli K-12. Nucleic Acids Res. 2000;28:60–64. [PubMed] 57. Schneider T.D., Rogan P.K. 1999. Computational analysis of nucleic acid information defines binding sites. United States Patent 5867402. 58. Goodrich J.A., Schwartz M.L., McClure W.R. Searching for and predicting the activity of sites for DNA binding proteins: compilation and analysis of the binding sites for Escherichia coli integration host factor (IHF). Nucleic Acids Res. 1990;18:4993–5000. [PubMed] 59. Robison K., McGuire A.M., Church G.M. A comprehensive library of DNA-binding site matrices for 55 proteins applied to the complete Escherichia coli K-12 genome. J. Mol. Biol. 1998;284:241–254. [PubMed] 60. Penotti F.E. Human DNA TATA boxes and transcription initiation sites: A statistical study. J. Mol. Biol. 1990;213:37–52. [PubMed] 61. Miller E.S., Kutter E., Mosig G., Arisaka F., Kunisawa T., Ruger W. Bacteriophage T4 genome. Microbiol. Mol. Biol. Rev. 2003;67:86–156. [PubMed] 62. Papp P.P., Chattoraj D.K., Schneider T.D. Information analysis of sequences that bind the replication initiator RepA. J. Mol. Biol. 1993;233:219–230. [PubMed] 63. Seeman N.C., Rosenberg J.M., Rich A. Sequence-specific recognition of double helical nucleic acids by proteins. Proc. Natl Acad. Sci. USA. 1976;73:804–808. [PubMed] 64. Brodolin K., Zenkin N., Severinov K. Remodeling of the sigma70 subunit non-template DNA strand contacts during the final step of transcription initiation. J. Mol. Biol. 2005;350:930–937. [PubMed] 65. Campbell E.A., Muzzin O., Chlenov M., Sun J.L., Olson C.A., Weinman O., Trester-Zedlitz M.L., Darst S.A. Structure of the bacterial RNA polymerase promoter specificity σ subunit. Mol. Cell. 2002;9:527–539. [PubMed] 66. Hershberg R., Altuvia S., Margalit H. A survey of small RNA-encoding genes in Escherichia coli. Nucleic Acids Res. 2003;31:1813–1820. [PubMed] 67. Oliphant A.R., Struhl K. Defining the consensus sequences of E. 
coli promoter elements by random selection. Nucleic Acids Res. 1988;16:7673–7683. [PubMed] 68. Moyle H., Waldburger C., Susskind M.M. Hierarchies of base pair preferences in the P22 ant promoter. J. Bacteriol. 1991;173:1944–1950. [PubMed] 69. Mirny L.A., Gelfand M.S. Structural analysis of conserved base pairs in protein–DNA complexes. Nucleic Acids Res. 2002;30:1704–1711. [PubMed] 70. Huerta A.M., Collado-Vides J. Sigma70 promoters in Escherichia coli: specific transcription in dense regions of overlapping promoter-like signals. J. Mol. Biol. 2003;333:261–278. [PubMed] 71. Young G.M., Postle K. Repression of tonB transcription during anaerobic growth requires Fur binding at the promoter and a second factor binding upstream. Mol. Microbiol. 1994;11:943–954. [PubMed] 72. Postle K., Good R.F. DNA sequence of the Escherichia coli tonB gene. Proc. Natl Acad. Sci. USA. 1983;80:5235–5239. [PubMed] 73. Zhi J., Mathew E., Freundlich M. Lrp binds to two regions in the dadAX promoter region of Escherichia coli to repress and activate transcription directly. Mol. Microbiol. 1999;32:29–40. [PubMed] 74. Mathew E., Zhi J., Freundlich M. Lrp is a direct repressor of the dad operon in Escherichia coli. J. Bacteriol. 1996;178:7234–7240. [PubMed] 75. Zhi J., Mathew E., Freundlich M. In vitro and in vivo characterization of three major dadAX promoters in Escherichia coli that are regulated by cyclic AMP-CRP and Lrp. Mol. Gen. Genet. 1998;258:442–447. [PubMed] 76. Wiese D.E.,II, Ernsting B.R., Blumenthal R.M., Matthews R.G. A nucleoprotein activation complex between the leucine-responsive regulatory protein and DNA upstream of the gltBDF operon in Escherichia coli. J. Mol. Biol. 1997;270:152–168. [PubMed] 77. Tsolis R.M., Baumler A.J., Stojiljkovic I., Heffron F. Fur regulon of Salmonella typhimurium: identification of new iron-regulated genes. J. Bacteriol. 1995;177:4628–4637. [PubMed] 78. Masse E., Vanderpool C.K., Gottesman S. 
Effect of RyhB small RNA on global iron use in Escherichia coli. J. Bacteriol. 2005;187:6962–6971. [PubMed] 79. Ussery D., Larsen T.S., Wilkes K.T., Friis C., Worning P., Krogh A., Brunak S. Genome organisation and chromatin structure in Escherichia coli. Biochimie. 2001;83:201–212. [PubMed] 80. Leroy J.L., Kochoyan M., Huynh-Dinh T., Guéron M. Characterization of base-pair opening in deoxynucleotide duplexes using catalyzed exchange of the imino proton. J. Mol. Biol. 1988;200:223–238. [PubMed] 81. Dubendorff J.W., deHaseth P.L., Rosendahl M.S., Caruthers M.H. DNA functional groups required for formation of open complexes between Escherichia coli RNA polymerase and the λ PR promoter. Identification via base analog substitutions. J. Biol. Chem. 1987;262:892–898. [PubMed] 82. Lyakhov I.G., Hengen P.N., Rubens D., Schneider T.D. The P1 phage replication protein RepA contacts an otherwise inaccessible thymine N3 proton by DNA distortion or base flipping. Nucleic Acids Res. 2001;29:4892–4900. [PubMed] 83. Sclavi B., Zaychikov E., Rogozina A., Walther F., Buckle M., Heumann H. Real-time characterization of intermediates in the pathway to open complex formation by Escherichia coli RNA polymerase at the T7A1 promoter. Proc. Natl Acad. Sci. USA. 2005;102:4706–4711. [PubMed] 84. Helmann J.D., deHaseth P.L. Protein–nucleic acid interactions during open complex formation investigated by systematic alteration of the protein and DNA binding partners. Biochemistry. 1999;38:5959–5967. [PubMed] 85. Fenton M.S., Lee S.J., Gralla J.D. Escherichia coli promoter opening and -10 recognition: mutational analysis of σ70. EMBO J. 2000;19:1130–1137. [PubMed] 86. Lim H.M., Lee H.J., Roy S., Adhya S. A ‘master’ in base unpairing during isomerization of a promoter upon RNA polymerase binding. Proc. Natl. Acad. Sci. USA. 2001;98:14849–14852. [PubMed] 87. deHaseth P.L., Tsujikawa L. 
Probing the role of region 2 of Escherichia coli σ70 in nucleation and maintenance of the single-stranded DNA bubble in RNA polymerase–promoter open complexes. Methods Enzymol. 2003;370:553–567. [PubMed] 88. Roy S., Lim H.M., Liu M., Adhya S. Asynchronous basepair openings in transcription initiation: CRP enhances the rate-limiting step. EMBO J. 2004;23:869–875. [PubMed] 89. Lee H.J., Lim H.M., Adhya S. An unsubstituted C2 hydrogen of adenine is critical and sufficient at the −11 position of a promoter to signal base pair deformation. J. Biol. Chem. 2004;279:16899–16902. [PubMed] 90. Heyduk E., Kuznedelov K., Severinov K., Heyduk T. A consensus adenine at position −11 of the nontemplate strand of bacterial promoter is important for nucleation of promoter melting. J. Biol. Chem. 2006;281:12362–12369. [PubMed] 91. Young B.A., Gruber T.M., Gross C.A. Minimal machinery of RNA polymerase holoenzyme sufficient for promoter melting. Science. 2004;303:1382–1384. [PubMed] 92. Schneider T.D. Evolution of biological information. Nucleic Acids Res. 2000;28:2794–2799. [PubMed] 93. Kawano M., Storz G., Rao B.S., Rosner J.L., Martin R.G. Detection of low-level promoter activity within open reading frame sequences of Escherichia coli. Nucleic Acids Res. 2005;33:6268–6276. [PubMed] 94. Wassarman K.M., Zhang A., Storz G. Small RNAs in Escherichia coli. Trends Microbiol. 1999;7:37–45. [PubMed] 95. Yin Y.W., Steitz T.A. Structural basis for the transition from initiation to elongation transcription in T7 RNA polymerase. Science. 2002;298:1387–1395. [PubMed] 96. Schneider T.D., Stormo G.D. Excess information at bacteriophage T7 genomic promoters detected by a random cloning technique. Nucleic Acids Res. 1989;17:659–674. [PubMed] 97. Stephens R.M., Schneider T.D. Features of spliceosome evolution and function inferred from an analysis of the information at human splice sites. J. Mol. Biol. 1992;228:1124–1136. [PubMed] 98. Travers A., Muskhelishvili G. 
DNA supercoiling—a global transcriptional regulator for enterobacterial growth? Nature Rev. Microbiol. 2005;3:157–169. [PubMed] 99. Chen Y.C., Jeng S.T. Binding affinity of T7 RNA polymerase to its promoter in the supercoiled and linearized DNA templates. Biosci. Biotechnol. Biochem. 2000;64:1126–1132. [PubMed] 100. Darst S.A. Bacterial RNA polymerase. Curr. Opin. Struct. Biol. 2001;11:155–162. [PubMed] 101. Rudd K.E., Schneider T.D. Compilation of E. coli ribosome binding sites. In: Miller J.H., editor. A Short Course in Bacterial Genetics: A Laboratory Manual and Handbook for Escherichia coli and Related Bacteria. Cold Spring Harbor, NY: Cold Spring Harbor Laboratory Press; 1992. pp. 17.19–17.45. 102. Hagerman P.J. Flexibility of DNA. Annu. Rev. Biophys. Biophys. Chem. 1988;17:265–286. [PubMed] 103. Halford S.E., Szczelkun M.D. How to get from A to B: strategies for analysing protein motion on DNA. Eur. Biophys. J. 2002;31:257–267. [PubMed] 104. Ringrose L., Chabanis S., Angrand P.O., Woodroofe C., Stewart A.F. Quantitative comparison of DNA looping in vitro and in vivo: chromatin increases effective DNA flexibility at short distances. EMBO J. 1999;18:6630–6641. [PubMed] 105. Collado-Vides J., Magasanik B., Gralla J.D. Control site location and transcriptional regulation in Escherichia coli. Microbiol. Rev. 1991;55:371–394. [PubMed] 106. Bolshoy A., Nevo E. Ecologic genomics of DNA: upstream bending in prokaryotic promoters. Genome Res. 2000;10:1185–1193. [PubMed] 107. Ball C.A., Osuna R., Ferguson K.C., Johnson R.C. Dramatic changes in Fis levels upon nutrient upshift in Escherichia coli. J. Bacteriol. 1992;174:8043–8056. [PubMed] 108. Travers A., Schneider R., Muskhelishvili G. DNA supercoiling and transcription in Escherichia coli: the FIS connection. Biochimie. 2001;83:213–217. [PubMed] 109. Siebenlist U., Simpson R.B., Gilbert W. E. coli RNA polymerase interacts homologously with two different promoters. Cell. 1980;20:269–281. [PubMed] 110. 
Roberts C.W., Roberts J.W. Base-specific recognition of the nontemplate strand of promoter DNA by E. coli RNA polymerase. Cell. 1996;86:495–501. [PubMed] 111. Althaus E.W., Outten C.E., Olson K.E., Cao H., O'Halloran T.V. The ferric uptake regulation (Fur) repressor is a zinc metalloprotein. Biochemistry. 1999;38:6559–6569. [PubMed] 112. Chen Z., Schneider T.D. Comparative analysis of tandem T7-like promoter containing regions in enterobacterial genomes reveals a novel group of genetic islands. Nucleic Acids Res. 2006;34:1133–1147. [PubMed] 113. Blattner F.R., Plunkett G., III, Bloch C.A., Perna N.T., Burland V., Riley M., Collado-Vides J., Glasner J.D., Rode C.K., Mayhew G.F., et al. The complete genome sequence of Escherichia coli K-12. Science. 1997;277:1453–1474. [PubMed]
[J Mol Biol. 2006]Nucleic Acids Res. 1990 Oct 25; 18(20):6097-100.
[Nucleic Acids Res. 1990]Methods Enzymol. 1996; 274():445-55.
[Methods Enzymol. 1996]J Mol Biol. 2001 Oct 12; 313(1):215-28.
[J Mol Biol. 2001]J Biol Chem. 1987 May 5; 262(13):6389-95.
[J Biol Chem. 1987]Nucleic Acids Res. 2000 Jan 1; 28(1):60-4.
[Nucleic Acids Res. 2000]J Theor Biol. 1997 Dec 21; 189(4):427-41.
[J Theor Biol. 1997]Nucleic Acids Res. 1997 Nov 1; 25(21):4408-15.
[Nucleic Acids Res. 1997]J Mol Biol. 2001 Oct 12; 313(1):215-28.
[J Mol Biol. 2001]Nucleic Acids Res. 1997 Dec 15; 25(24):4994-5002.
[Nucleic Acids Res. 1997]Nucleic Acids Res. 2001 Dec 1; 29(23):4881-91.
[Nucleic Acids Res. 2001]Nucleic Acids Res. 1990 Sep 11; 18(17):4993-5000.
[Nucleic Acids Res. 1990]J Mol Biol. 1998 Nov 27; 284(2):241-54.
[J Mol Biol. 1998]Nucleic Acids Res. 2000 Jan 1; 28(1):60-4.
[Nucleic Acids Res. 2000]Nucleic Acids Res. 2001 Jan 1; 29(1):72-4.
[Nucleic Acids Res. 2001]Nucleic Acids Res. 2001 Jan 1; 29(1):277.
[Nucleic Acids Res. 2001]J Mol Biol. 1990 May 5; 213(1):37-52.
[J Mol Biol. 1990]Microbiol Mol Biol Rev. 2003 Mar; 67(1):86-156, table of contents.
[Microbiol Mol Biol Rev. 2003]J Mol Biol. 1993 Sep 20; 233(2):219-30.
[J Mol Biol. 1993]Proc Natl Acad Sci U S A. 1976 Mar; 73(3):804-8.
[Proc Natl Acad Sci U S A. 1976]J Mol Biol. 2005 Jul 29; 350(5):930-7.
[J Mol Biol. 2005]Nucleic Acids Res. 2001 Dec 1; 29(23):4881-91.
[Nucleic Acids Res. 2001]Prog Nucleic Acid Res Mol Biol. 1990; 38():137-64.
[Prog Nucleic Acid Res Mol Biol. 1990]Annu Rev Genet. 1984; 18():173-206.
[Annu Rev Genet. 1984]Annu Rev Biochem. 1985; 54():171-204.
[Annu Rev Biochem. 1985]Nucleic Acids Res. 1983 Apr 25; 11(8):2237-55.
[Nucleic Acids Res. 1983]Proc Natl Acad Sci U S A. 1996 Aug 20; 93(17):8858-62.
[Proc Natl Acad Sci U S A. 1996]Nature. 2002 Jun 13; 417(6890):712-9.
[Nature. 2002]J Mol Biol. 1989 May 20; 207(2):301-10.
[J Mol Biol. 1989]Biochim Biophys Acta. 1988 Mar 31; 949(3):311-7.
[Biochim Biophys Acta. 1988]Biochemistry. 2003 Sep 16; 42(36):10718-25.
[Biochemistry. 2003]Nucleic Acids Res. 2003 Apr 1; 31(7):1813-20.
[Nucleic Acids Res. 2003]Annu Rev Biochem. 1985; 54():171-204.
[Annu Rev Biochem. 1985]Nucleic Acids Res. 1983 Apr 25; 11(8):2237-55.
[Nucleic Acids Res. 1983]J Biol Chem. 1982 Dec 10; 257(23):13924-9.
[J Biol Chem. 1982]Nucleic Acids Res. 1988 Aug 11; 16(15):7673-83.
[Nucleic Acids Res. 1988]Nucleic Acids Res. 2001 Dec 1; 29(23):4881-91.
[Nucleic Acids Res. 2001]Annu Rev Genet. 1984; 18():173-206.
[Annu Rev Genet. 1984]J Biol Chem. 1987 May 5; 262(13):6389-95.
[J Biol Chem. 1987]EMBO J. 1997 Jul 1; 16(13):4034-40.
[EMBO J. 1997]Nucleic Acids Res. 2001 Dec 1; 29(23):4881-91.
[Nucleic Acids Res. 2001]Nucleic Acids Res. 2000 Jan 1; 28(1):60-4.
[Nucleic Acids Res. 2000]J Mol Biol. 2003 Oct 17; 333(2):261-78.
[J Mol Biol. 2003]Mol Microbiol. 1994 Mar; 11(5):943-54.
[Mol Microbiol. 1994]J Theor Biol. 1997 Dec 21; 189(4):427-41.
[J Theor Biol. 1997]Nucleic Acids Res. 1997 Nov 1; 25(21):4408-15.
[Nucleic Acids Res. 1997]Proc Natl Acad Sci U S A. 1983 Sep; 80(17):5235-9.
[Proc Natl Acad Sci U S A. 1983]Nucleic Acids Res. 1999 Feb 1; 27(3):882-7.
[Nucleic Acids Res. 1999]Mol Microbiol. 1999 Apr; 32(1):29-40.
[Mol Microbiol. 1999]Mol Gen Genet. 1998 May; 258(4):442-7.
[Mol Gen Genet. 1998]J Bacteriol. 1996 Dec; 178(24):7234-40.
[J Bacteriol. 1996]Mol Gen Genet. 1998 May; 258(4):442-7.
[Mol Gen Genet. 1998]J Mol Biol. 2003 Oct 17; 333(2):261-78.
[J Mol Biol. 2003]J Bacteriol. 1998 Jun; 180(12):3019-25.
[J Bacteriol. 1998]Nat Rev Microbiol. 2004 Jan; 2(1):57-65.
[Nat Rev Microbiol. 2004]Nature. 1969 Jan 4; 221(5175):43-6.
[Nature. 1969]Science. 2002 May 17; 296(5571):1285-90.
[Science. 2002]J Mol Biol. 1997 Jul 11; 270(2):152-68.
[J Mol Biol. 1997]Science. 2002 Aug 30; 297(5586):1562-6.
[Science. 2002]J Bacteriol. 1995 Aug; 177(16):4628-37.
[J Bacteriol. 1995]J Bacteriol. 2005 Oct; 187(20):6962-71.
[J Bacteriol. 2005]Nucleic Acids Res. 1997 Dec 15; 25(24):4994-5002.
[Nucleic Acids Res. 1997]J Mol Biol. 1998 Nov 27; 284(2):241-54.
[J Mol Biol. 1998]Nucleic Acids Res. 2001 Dec 1; 29(23):4881-91.
[Nucleic Acids Res. 2001]Nucleic Acids Res. 1990 Sep 11; 18(17):4993-5000.
[Nucleic Acids Res. 1990]Biochimie. 2001 Feb; 83(2):201-12.
[Biochimie. 2001]J Mol Biol. 1998 Nov 27; 284(2):241-54.
[J Mol Biol. 1998]J Mol Biol. 1986 Apr 5; 188(3):415-31.
[J Mol Biol. 1986]J Mol Biol. 2001 Oct 12; 313(1):215-28.
[J Mol Biol. 2001]Annu Rev Biochem. 1985; 54():171-204.
[Annu Rev Biochem. 1985]Nucleic Acids Res. 2001 Dec 1; 29(23):4881-91.
[Nucleic Acids Res. 2001]J Mol Biol. 1993 Sep 20; 233(2):219-30.
[J Mol Biol. 1993]Mol Cell. 2002 Mar; 9(3):527-39.
[Mol Cell. 2002]Nucleic Acids Res. 2001 Dec 1; 29(23):4881-91.
[Nucleic Acids Res. 2001]J Mol Biol. 1988 Mar 20; 200(2):223-38.
[J Mol Biol. 1988]J Biol Chem. 1987 Jan 15; 262(2):892-8.
[J Biol Chem. 1987]Methods Enzymol. 1996; 274():445-55.
[Methods Enzymol. 1996]J Mol Biol. 1993 Sep 20; 233(2):219-30.
[J Mol Biol. 1993]Nucleic Acids Res. 2001 Dec 1; 29(23):4881-91.
[Nucleic Acids Res. 2001]Nucleic Acids Res. 2001 Dec 1; 29(23):4892-900.
[Nucleic Acids Res. 2001]Proc Natl Acad Sci U S A. 2005 Mar 29; 102(13):4706-11.
[Proc Natl Acad Sci U S A. 2005]Nucleic Acids Res. 2001 Dec 1; 29(23):4881-91.
[Nucleic Acids Res. 2001]Biochemistry. 1999 May 11; 38(19):5959-67.
[Biochemistry. 1999]J Biol Chem. 2006 May 5; 281(18):12362-9.
[J Biol Chem. 2006]Science. 2004 Feb 27; 303(5662):1382-4.
[Science. 2004]Mol Cell. 2002 Mar; 9(3):527-39.
[Mol Cell. 2002]J Mol Biol. 1990 May 5; 213(1):37-52.
[J Mol Biol. 1990]J Mol Biol. 1986 Apr 5; 188(3):415-31.
[J Mol Biol. 1986]Nucleic Acids Res. 2000 Jul 15; 28(14):2794-9.
[Nucleic Acids Res. 2000]Science. 2002 May 17; 296(5571):1285-90.
[Science. 2002]Annu Rev Biochem. 1985; 54():171-204.
[Annu Rev Biochem. 1985]Nucleic Acids Res. 1983 Apr 25; 11(8):2237-55.
[Nucleic Acids Res. 1983]Nucleic Acids Res. 1982 Feb 11; 10(3):903-12.
[Nucleic Acids Res. 1982]Nucleic Acids Res. 1983 Sep 10; 11(17):5855-64.
[Nucleic Acids Res. 1983]Biochim Biophys Acta. 1988 Mar 31; 949(3):311-7.
[Biochim Biophys Acta. 1988]J Mol Biol. 2003 Oct 17; 333(2):261-78.
[J Mol Biol. 2003]Nucleic Acids Res. 2005; 33(19):6268-76.
[Nucleic Acids Res. 2005]Nucleic Acids Res. 2005; 33(19):6268-76.
[Nucleic Acids Res. 2005]J Mol Biol. 1986 Apr 5; 188(3):415-31.
[J Mol Biol. 1986]Nucleic Acids Res. 2000 Jul 15; 28(14):2794-9.
[Nucleic Acids Res. 2000]J Biol Chem. 1987 May 5; 262(13):6389-95.
[J Biol Chem. 1987]EMBO J. 1997 Jul 1; 16(13):4034-40.
[EMBO J. 1997]Nucleic Acids Res. 1983 Apr 25; 11(8):2237-55.
[Nucleic Acids Res. 1983]Nucleic Acids Res. 1982 Feb 11; 10(3):903-12.
[Nucleic Acids Res. 1982]Nucleic Acids Res. 1983 Sep 10; 11(17):5855-64.
[Nucleic Acids Res. 1983]J Mol Biol. 1986 Apr 5; 188(3):415-31.
[J Mol Biol. 1986]Nucleic Acids Res. 2000 Jul 15; 28(14):2794-9.
[Nucleic Acids Res. 2000]J Theor Biol. 1991 Jan 7; 148(1):125-37.
[J Theor Biol. 1991]Science. 2002 Nov 15; 298(5597):1387-95.
[Science. 2002]Nucleic Acids Res. 1989 Jan 25; 17(2):659-74.
[Nucleic Acids Res. 1989]J Mol Biol. 1992 Dec 20; 228(4):1124-36.
[J Mol Biol. 1992]Biochim Biophys Acta. 1988 Mar 31; 949(3):311-7.
[Biochim Biophys Acta. 1988]Genome Biol. 2004; 5(11):R87.
[Genome Biol. 2004]Nat Rev Microbiol. 2005 Feb; 3(2):157-69.
[Nat Rev Microbiol. 2005]Biosci Biotechnol Biochem. 2000 Jun; 64(6):1126-32.
[Biosci Biotechnol Biochem. 2000]Annu Rev Biochem. 1985; 54():171-204.
[Annu Rev Biochem. 1985]Methods Enzymol. 1996; 274():445-55.
[Methods Enzymol. 1996]EMBO J. 1997 Jul 1; 16(13):4034-40.
[EMBO J. 1997]Nature. 2002 Jun 13; 417(6890):712-9.
[Nature. 2002]Prog Nucleic Acid Res Mol Biol. 1990; 38():137-64.
[Prog Nucleic Acid Res Mol Biol. 1990]Microbiol Mol Biol Rev. 2003 Mar; 67(1):86-156, table of contents.
[Microbiol Mol Biol Rev. 2003]J Mol Biol. 1986 Apr 5; 188(3):415-31.
[J Mol Biol. 1986]Science. 2004 Feb 27; 303(5662):1382-4.
[Science. 2004]J Mol Biol. 2001 Oct 12; 313(1):215-28.
[J Mol Biol. 2001]J Mol Biol. 2003 Oct 17; 333(2):261-78.
[J Mol Biol. 2003]Nucleic Acids Res. 2000 Jan 1; 28(1):60-4.
[Nucleic Acids Res. 2000]Nature. 2002 Jun 13; 417(6890):712-9.
[Nature. 2002]Curr Opin Struct Biol. 2001 Apr; 11(2):155-62.
[Curr Opin Struct Biol. 2001]J Mol Biol. 2001 Oct 12; 313(1):215-28.
[J Mol Biol. 2001]J Mol Biol. 1998 Nov 27; 284(2):241-54.
[J Mol Biol. 1998]Science. 2002 Aug 30; 297(5586):1562-6.
[Science. 2002]J Mol Biol. 1999 Oct 22; 293(2):199-213.
[J Mol Biol. 1999]J Mol Biol. 1999 Oct 22; 293(2):199-213.
[J Mol Biol. 1999]Annu Rev Biophys Biophys Chem. 1988; 17():265-86.
[Annu Rev Biophys Biophys Chem. 1988]Eur Biophys J. 2002 Jul; 31(4):257-67.
[Eur Biophys J. 2002]EMBO J. 1999 Dec 1; 18(23):6630-41.
[EMBO J. 1999]Microbiol Rev. 1991 Sep; 55(3):371-94.
[Microbiol Rev. 1991]J Bacteriol. 1992 Dec; 174(24):8043-56.
[J Bacteriol. 1992]Nucleic Acids Res. 1997 Dec 15; 25(24):4994-5002.
[Nucleic Acids Res. 1997]Biochimie. 2001 Feb; 83(2):201-12.
[Biochimie. 2001]Biochimie. 2001 Feb; 83(2):213-7.
[Biochimie. 2001]Microbiol Rev. 1991 Sep; 55(3):371-94.
[Microbiol Rev. 1991]Cell. 1980 Jun; 20(2):269-81.
[Cell. 1980]Cell. 1996 Aug 9; 86(3):495-501.
[Cell. 1996]Nucleic Acids Res. 2000 Jan 1; 28(1):60-4.
[Nucleic Acids Res. 2000]J Theor Biol. 1997 Dec 21; 189(4):427-41.
[J Theor Biol. 1997]Mol Microbiol. 1994 Mar; 11(5):943-54.
[Mol Microbiol. 1994]Biochemistry. 1999 May 18; 38(20):6559-69.
[Biochemistry. 1999]Nucleic Acids Res. 1997 Nov 1; 25(21):4408-15.
[Nucleic Acids Res. 1997]Nucleic Acids Res. 2006; 34(4):1133-47.
[Nucleic Acids Res. 2006]Mol Microbiol. 1999 Apr; 32(1):29-40.
[Mol Microbiol. 1999]Mol Gen Genet. 1998 May; 258(4):442-7.
[Mol Gen Genet. 1998]J Mol Biol. 1997 Jul 11; 270(2):152-68.
[J Mol Biol. 1997]Science. 1997 Sep 5; 277(5331):1453-62.
[Science. 1997]Science. 1997 Sep 5; 277(5331):1453-62.
[Science. 1997]