![]() | ![]() |
Formats:
|
||||||||||||||||||||||||||||
Copyright Shultzaberger et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Determining Physical Constraints in Transcriptional Initiation Complexes Using DNA Sequence Analysis 1Department of Molecular and Cell Biology, University of California, Berkeley, United States of America 2Graduate Group in Biophysics, University of California, Berkeley, United States of America 3Department of Genome Sciences, Genomic Division, Ernest Orlando Lawrence Berkeley National Lab, Berkeley, California, United States of America Abraham Sonenshein, Academic Editor Tufts University, United States of America * To whom correspondence should be addressed. E-mail: mbeisen/at/lbl.gov Conceived and designed the experiments: DC AM RS. Performed the experiments: RS. Analyzed the data: DC RS. Contributed reagents/materials/analysis tools: RS. Wrote the paper: ME RS. Received March 13, 2007; Accepted October 29, 2007. This article has been cited by other articles in PMC.Abstract Eukaryotic gene expression is often under the control of cooperatively acting transcription factors whose binding is limited by structural constraints. By determining these structural constraints, we can understand the “rules” that define functional cooperativity. Conversely, by understanding the rules of binding, we can infer structural characteristics. We have developed an information theory based method for approximating the physical limitations of cooperative interactions by comparing sequence analysis to microarray expression data. When applied to the coordinated binding of the sulfur amino acid regulatory protein Met4 by Cbf1 and Met31, we were able to create a combinatorial model that can correctly identify Met4 regulated genes. Interestingly, we found that the major determinant of Met4 regulation was the sum of the strength of the Cbf1 and Met31 binding sites and that the energetic costs associated with spacing appeared to be minimal. Introduction The regulation of transcriptional initiation from individual eukaryotic promoters is often controlled by multiple cooperatively interacting transcription factors. These factors bind to separate sites in cis-regulatory sequences and physically interact with each other, either directly or through additional proteins, to activate or repress transcription [1], [2], [3]. These physical interactions among transcription factors must constrain how their binding sites can be positioned relative to each other and to the relevant promoters. Yet, there is often considerable variability in the order, orientation and spacing of binding sites for interacting transcription factors [4], [5], [6]. Understanding how the arrangement of sites is related to the stability of these complexes and their regulatory activity is essential if we are to understand the regulatory content of eukaryotic genomes. To successfully model the binding of multi-meric complexes to different target sequences, many energetic contributions need to be considered. The affinity of each transcription factor for DNA varies considerably with the precise bound sequence, even among known in vivo targets [7], [8]. The stability of the entire complex is also dependent on how compatible the positioning of the sites are with the protein-protein interactions necessary to form the complex. Poorly positioned sites presumably introduce clashes or strain into either the complex or DNA which will, in turn, reduce the stability of the complex. Here, we combine DNA sequence analysis and genome-wide expression data to discern the constraints on the arrangement of binding sites for transcription factors involved in regulating the synthesis of sulfur-containing amino acids in the yeast Saccharomyces cerevisiae. This work builds on our previous modeling of bipartite prokaryotic ribosome and σ70 binding sites [9], [10]. In both of these cases, initiation requires the cooperative binding of two independent components separated by a variable spacer, the Shine-Dalgarno and P site for ribosome binding sites, and the −10 and −35 for σ70 binding sites [11], [12], [13], [14]. Since there were a large number of characterized sites for these systems, we constructed a robust distribution of the allowable spacings between binding components. Assuming that the spacing that would induce the least amount of strain in the protein or in the bound DNA upon binding would be the most commonly observed, and that the frequency of occurrence of all other spacings would be directly related to the energetic consequence of using that spacing, we could model the energetic contribution of different spacings to the formation of a stable initiation complex. Cooperatively acting transcription factors in eukaryotes are similar to the prokaryotic ribosome and σ70 in that they have independent binding components separated by variable spacers, but they are different in that the components are not physically linked upon binding and therefore can bind in different orders, orientations, and with greater variability in their spacing. We have devised a method to determine these additional physical constraints by optimizing an information theory based model against microarray data. We can use these optimized constraints to not only infer structural characteristics of the regulatory complex, but also to quantify the binding of these multi-meric complexes to different DNA sequences, and to accurately predict target genes. Met4 is the major transcriptional activator of sulfur utilization genes in Saccharomyces cerevisiae even though it does not bind directly to DNA [15], [5]. Met4 stabilization is dependent upon at least two additional proteins. One of these is the centromere-binding factor (Cbf1) [15], whose DNA binding activity is stimulated by association with Met28 [16]. It has been suggested that the Cbf1-Met28-Met4 complex may be sufficient for activation of some genes, but coordination by a second factor is necessary for others [4]. We are interested in describing this coordinated system. The second stabilizing factor that we will study is Met31, a factor unique to sulfur regulation [17]. Neither the distance between Cbf1 and Met31 in functional Met4 stabilizing complexes, nor the distance between Met4 and the initiating polymerase is fixed [5]. We extended the information theory-based method we used to study prokaryotic translational and transcriptional initiation to model Cbf1 and Met31 interactions, allowing for the greater flexibility present in this system. Materials and Methods Cbf1 and Met31 binding models We built a weight matrix describing the sequence preferences of Cbf1 from 16 Cbf1 binding sites characterized by Wieland et al.[18]. Binding matrices were built using the standard Delila programs [19], [20]. Since Cbf1 binds as a homodimer, we used each sequence and its complement to build our model [21] (Fig. 1A
Searching algorithm Multi-component binding systems with variable spacing between components have previously been modeled [9], [10]. In the case of the prokaryotic ribosome and σ70, the binding components are physically connected. In both instances, deviations in the optimal spacing between components introduces strain in the bound complex and affects the binding energy [11], [12], [13], [14]. To model these multi-meric binders the following equation was used:
= -log2(1/(dmax–dmin+1)), where dmin is the shortest spacing between Met31 and Cbf1, and dmax is the longest spacing. The distance between Met31 and Cbf1 is calculated between the zero positions of the binding components as with previous flexible models.For the ribosome and the polymerase, the binding components are physically linked and can only bind in one orientation relative to each other. For cooperatively acting transcription factors though, there could be variation in the orientation of the sites relative to each other. To account for this, we can adapt the gap surprisal function to:
= 1, and the total number of occurrences is n = 2. The orientation surprisal for this system would therefore be 1 bit of information. In a system where there is no variability in orientation, the frequency of occurrence at that orientation would be n(o)/n = 1, and therefore the orientation surprisal would be 0 bits. The advantage of the OS(o) calculation is that we can model the subtle energetic differences for systems that allow either orientation, but favor one over the other.To calculate the total information for Met4 coordination, we can now expand equation (1) to:
= 0 bits. Therefore, the orientation surprisal only applies to asymmetric binders.Combinatorial scans were done using multiscan [10] to identify and quantify Cbf1/Met31 cooperatively acting binding sites in the genome. The individual information contribution for both sites (Ri(Cbf1) and Ri(Met31)) were calculated over the range −4 to +5, since this is the range of conservation for both logos (Fig. 1 All genes in the genome were then ranked based on the strength of their strongest upstream site. Microarray expression data for sulfur amino acid pathway-affected cells (see Microarray Datasets) were then averaged for the top 30 genes in our ranking. All values averaged were log2 of the expression fold change between affected and unaffected cells. This was done independently for induction and repression experiments. The physical constraints that we want to define are: the ordering of the sites relative to the gene start, the orientation of the matrices, the maximum allowed distance between Met4 and the polymerase binding site, and the spacing range between Cbf1 and Met31 that can bind Met4. We varied these constraints, and iteratively refined the model to get the optimal predictor. We evaluated any given set of parameters by calculating the average expression change in the top 30 ranked genes. The greater the expression change the better the model. Another approach could be to cluster genes based on similar trends in expression data across several experiments, and then try to train our parameters based on this set of genes. One disadvantage of this is that it is difficult to discern directly from indirectly regulated genes in these clusters. By scanning the genome and ranking the genes, we are selecting only for genes that are directly regulated. Also this approach does not exclude genes that are regulated but had anomalous expression data due to experimental error. Since there have been at least 20 genes implicated in sulfur assimilation [5], we chose to average the top 30 gene expression differences to evaluate our model. We chose 30 so that we would not overfit our model by looking at too few genes, and not introduce noise into our analysis by averaging too many. Microarray Datasets We used microarray data from two sources for our analysis. Gasch et al. [26] reported amino acid starvation data, where transcription of Met4 regulated genes was induced. Fauchon et al. [27] reported Cd2+ addition experiments where Met4 regulated genes were induced, and Met4 deletion experiments where Met4 regulated genes were repressed. Our models were optimized against these data as mentioned above. Microarray expression patterns were visualized using TreeView [28]. The yeast genome sequence and annotation that we used in our analysis came from Genbank accession numbers NC001133 to NC001148. Results Cbf1 and Met31 logos Since Cbf1 is a homodimeric protein, we used all sequences and their complements to build our model [21]. Conservation at positions −2,−1 and +2,+3 is strong and does not match the helical accessibility wave (Fig. 1A The Met31 model was built as described in Materials and Methods (Fig. 1B Orientation and ordering Since Cbf1 and Met31 are not physically linked upon binding, it was not immediately obvious what the ordering and orientation constraints on their binding are in functional Met4 docking complexes. To determine this, we tested the predictive capabilities of all combinations of orientation and ordering for Cbf1 and Met31 using the gene-ranking approach described in Materials and Methods. Briefly, we determined the flexible information for the cooperative model as determined by equation (4) [9], [10], and ranked all genes in the genome based on the strength of the strongest site in the intergenic region immediately upstream of their starts. We then calculated the average expression fold change of the top 30 genes in this ranking based on Met4 induced and repressed microarray experiments [26], [27]. We regarded those combinations that gave the highest average microarray expression change to be the optimal organization for Met4 coordination. Fig. 2
Cbf1 alone was not sufficient to identify the Met4 regulated genes. The average expression fold change for the top 30 ranked genes was 0.09 and 0.16 for induction and repression data respectively, we report corresponding values for all other combinations. Met31 alone appeared to be a better predictor than Cbf1, but was still weak (0.44 and −0.90). This improvement of Met31 prediction over Cbf1 is expected since Cbf1 has a regulatory role outside of Met4 binding [31]. For both Cbf1 and Met31, we only considered binding sites within 1000 bases upstream of the closest gene start. By searching for Cbf1 and Met31 sites together, with a maximum spacing of 100 bases between the zero positions of the binding components (Fig. 1 Since Cbf1 is a homodimer, its binding is independent of orientation. Since Met31 is monomeric, its binding is orientation dependent. When we allowed for both orientations of Met31 downstream of Cbf1, we got the largest change of expression (0.95 and −1.57). This suggested that transcriptional activation by Met4 requires a Met31 site with any orientation to fall between Cbf1 and the gene start (bottom right panel of Fig. 2 To test whether the average expression values that we observed are statistically significant, we randomly chose 10,000 sets of 30 genes from the genome and averaged their expression change values. We did this for both the induced and repressed data sets. Both sets gave similar normal distributions with a mean of −0.015 and SD of 0.11 for the induced data set and a mean of 0.001 and SD of 0.094 for the repressed data set. For the best organization of sites in Figure 2 All models for the remainder of this analysis will have these ordering and orientation requirements imposed on them. The designation of the Met31 model orientation as “normal” or “inverted” is arbitrary. We also tested the “inverted” Met31 model alone, and inverted Met31 upstream of Cbf1, but the results were similar to equivalent scans with the “normal” orientation (data not shown). Spacing constraints There are two spacing constraints on this system, the distance between the Met4 docking complex and the initiating polymerase, and the distance between the two binding components (Cbf1 and Met31) within the Met4 docking complex. To define what these spacing ranges are for functional Met4 binding sites, we systematically modeled different spacing ranges, and quantified the models by the gene-ranking approach previously described. Interestingly, if we varied one of the spacing constraints, the optimal spacing for the other would differ slightly. To identify which spacing parameters define the optimal predictor, we varied both spacings simultaneously, and quantified their predictability by averaging the expression change of their 30 highest ranking genes. We increased the maximum allowed distance of the Met4 docking complex from the gene start in 50 base increments as measured by the distance between the Met31 site and the translational initiation codon. At each 50 bp increment, we varied the minimum and maximum allowed distance between Cbf1 and Met31 from 1 to 100 bases. These distances are relative to the zero position of both matrices (Fig. 1 For the first spacing constraint, the distance between Met4 and the polymerase, we found the optimal maximum spacing was 450 bases (Fig. 3
For the second spacing constraint, the distance between Cbf1 and Met31, we found the optimal spacing range to be −9 to −68 bases, the minimum to maximum spacing allowed between each site (Fig. 4
Optimal model Based on the analysis in Fig. 2
To test whether we were over- or under-fitting our model, we repeated the optimization by averaging the top 20 and 40 genes instead of 30. When we did this our results differed slightly. By averaging the top 20 genes, we found the optimal spacing range shrunk slightly to 13 to 68 bases between Cbf1 and Met31 and the optimal maximum distance between Met31 and the gene start remained at 450 bases. At this smaller range, Gsh1 and Sam1 were lost, both which have been implicated in sulfur assimilation. By averaging the top 40 genes, the spacing range remained the same (9 to 68 bases), but the maximum spacing range expanded to 750 bases. No new genes with the expected expression pattern or functional evidence for Met4 regulation were identified at this larger spacing. The larger the number of genes that are averaged, the more likely random genes with above average expression differences will be in the averaged set, which will obscure the actual parameters. Thirty genes appeared to be the appropriate number to average, and the relatively consistent parameters observed from averaging 20, 30, or 40 genes suggested that these parameters are reliable. To test whether there is a tendency for Cbf1 and Met31 to bind on the same face of the DNA, we plotted the relative spacing between the two sites on a cosine wave with the same period as B-form DNA, 10.6 bases (Fig. 5 Sam1 has the shortest Cbf1-Met31 spacing of 9 bases. This spacing is completely on the opposite face of the DNA according to a phasing of −13.86 bases. One would expect spacing to be more constrained at shorter distances, since it is more difficult for DNA to bend. We looked at the region upstream of the Sam1 translational start to see if there was another strong Cbf1-Met31 pair with a larger spacing. We found a second weaker Cbf1 site 36 bases upstream of the Met31 site, giving a total flexible information of 12.6 bits. This spacing would suggest same face binding according to a phasing of −13.86 bases. The strong Cbf1 site 9 bases upstream of the Met31 site may have occurred randomly, or may be part of an overlapping second site. To calculate the flexible individual information for each binding site, we used equation (4). Since we did not know the energetic effect of different spacings on the complex initially, we treated all spacings equally. That is, over the range 9 to 68 (60 bases of variability) all positions had the same gap surprisal of GS(d) = -log2(1/60) = 5.91 bits according to equation (2). We also assumed an equiprobable occurrence of each orientation of Met31, so that OSMet31(o) = -log2(1/2) = 1 bit of information according to equation (3). Therefore the GS and OS variables in equation (4) effectively become constants summing to 6.91 bits of uncertainty for each site. Because of the small number of target genes, and the already strong predictive capabilities of our model, we cannot determine the individual spacing constraints for this system. If we had a system with more sites, robust spacing and orientation distributions could be determined and individual penalties could be assigned.We can use these values to predict the Rsequence or average information content for this system which is:
S(d) is the mean GS(d) value for all sites, and ŌS(o) is the mean OS(o) value. According to this equation Rsequence(Met4) = 12.9+11.9−5.91−1.0 = 17.9 bits of information.For each gene we plotted the strength of its strongest upstream Met4 binding site according to the model in Fig. 5
Discussion Transcriptional initiation in eukaryotes is often regulated by multiple cooperatively acting factors. Often these factors can only positively affect transcription if they physically interact either directly or indirectly through additional proteins with the basal transcriptional machinery. Understanding the physical constraints that determine functional cooperativity is essential for us to be able to model, predict, and engineer genetic control systems. These constraints generally are not rigid, but allow for variability in the arrangement of sites in functional complexes and subsequently there is variability in the stability of the complexes. Here, we have introduced a way to include orientation and order into the information theoretic description of pattern recognition at the promoter. This combined with weight matrix based binding models [20] and spacing constraints [9], [10] gives us quantitative tools to model the sequence basis of eukaryotic transcriptional regulation. The simplest constraint of Met4 coordination to define is the ordering of the sites within the complex. For Met4, our model matches microarray data poorly when the order is Met31-Cbf1-gene start, but matches well with the order Cbf1-Met31-gene start (Fig. 2 It appears as though either orientation of Met31 can be used within the docking complex (Fig. 2 The maximum distance between the Met4 docking site, as measured to the zero coordinate of the Met31 site, and the gene start is 450 bases. As the TFIID binding site is not at the gene start, this distance is farther than the maximum allowed distance between Met4 and the polymerase. It is difficult to determine the distance of the Met4 docking complex to the transcriptional start since the starts have not been biochemically proven, and computationally it is difficult to predict transcription initiation because of the varied modes of initiation by the polymerase [2]. Basehoar et al. found an enrichment of TATAs between 50 and 200 bases upstream of the translational start [33]. This could explain why we did not observe any sites within 100 bases of the gene start. The spacing range between the Cbf1 and Met31 site is 9 to 68 bases, as determined in Fig. 4 A minimum spacing of 12 would be consistent with our observed optimal helical phasing of -13.86 bases (Fig. 5 The experimentally determined range by Chiang et al. was 21 to 53 bases according to our numbering system [32]. Unfortunately, spacings as large as 68 bases were not tested experimentally. The experimentally determined minimum spacing of 21 is much larger than the minimum we found here. Interestingly, only the “inverted” orientation of Met31 was tested, whereas the shortest distance in this paper corresponds to a Met31 site with the opposite orientation. If helical phasing of the sites is important, then the orientation of Met31 may be more constrained at shorter distances, and this may account for the disparity between the experimentally and computationally determined minimums. When the constraints inferred from our analysis were imposed on the cooperative binding of Cbf1 and Met31, our ability to predict Met4 regulation was high. Of the top 23 ranked genes in S. cerevisiae (according to our model), 19 had an expected microarray expression pattern for Met4 regulation. Many of the sites had also been previously characterized as sulfur utilization genes (Fig. 6 When the microarray data from experiments that affected Met4 binding were directly compared to our information evaluation of each gene (Fig. 7 We cannot determine the individual spacing effect on binding because we have so few sites covering a large spacing range. If we did have these individual effects, we would expect to see a slight improvement in our ability to predict expression. The fact that we did so well in predicting Met4 sites by just taking into account the strength of the sites is surprising to us. In bacteria, transcription factors interact at shorter distances, and the effect of spacing on stability is greater (since it is more difficult to bend a short piece of DNA than a large one) [10]. For coordinated binding of Met4, the summed affinities of Cbf1 and Met31 appeared to dominate the stability of the Met4 complex, suggesting that the energetic penalties associated with spacing are considerably less for this system. Experimental testing of multiple spacings between Cbf1 and Met31 suggested that spacing had a little effect on Met4 regulation [32]. A decrease in the effect of spacing on stability could be due to increasing the flexibility between the activation domain and the DNA-binding domain of the transcription factors, by increasing the distance between functioning cooperatively acting binding sites, or by increasing flexibility in the coordinating protein. We could have determined these physical constraints by clustering co-regulated genes and training the rules of binding for their regulators. The drawback of this approach is that it is not obvious which genes are directly and indirectly regulated, and a given gene may or may not have a binding site. Our approach selects only for genes that are directly regulated, and does not exclude sites that have poor expression data due to experimental error. We are also optimizing our model against all genes in the genome, so we are selecting for a model that represents Met4 binding well, in that it can identify a small subset of sites from all sites in the genome. Presumably the optimal binding site, based on the flexible information theory approach, is the most stable site and the easiest to crystalize. These results could be used to guide crystallographic experiments. The information content of a given DNA-binding protein (Rsequence) is a function of the variability within its binding targets [24]. A more stringent binder would have a higher information content, since the variability in its binding targets would be smaller. To be able to distinguish γ binding sites within a random DNA of some length G, those sites must have an Rfrequency = -log2(γ/G) bits of information to be identified [24]. It has been shown for many systems that Rsequence converges to Rfrequency [24], [35]. This suggests that if the size of the genome increases and the number of binding sites remains constant, the information of those sites would have to increase in order to be distinguishable.As eukaryotic genomes are generally larger than prokaryotic genomes, the amount of information needed to identify γ sites would have to be greater. This can be achieved either by increasing the information content of a single factor, or by using multiple factors combinatorially. Assuming no individual spacing or orientation preferences, the information for this system would be 12.9+11.9−5.9−1 = 17.9 bits according to equation (4). This would correspond to 1 site every 217.9 = 2.45×105 bases, or about 104 times in the S. cerevisiae genome of length 12.8 MB. Our calculation is the number of sites in 2× the genome length, since the complex could associate with either strand. This is a reasonable number of genes according to known sulfur assimilation genes (>20 genes) [5], the number of predicted regulated genes based on expression difference due to Cd2+ treatment (66 genes) [27], and multiple sites per gene as seen in several cases. This suggests that like single acting transcription factors, the information contained in combinatorial binders is related to the Rfrequency for that system [24]. Others have suggested that this relationship will be maintained for cooperatively acting factors [36]. Interestingly, this information is distributed through individual binding components, as well as the spacing between components, and if one component changed, the others would have to compensate accordingly. This is a complicated process, since Cbf1 can also function independently of Met4 and Met31 [31].Acknowledgments We would like to thank Tom Schneider, Richard Lusk, James Fraser and Virgil Rhodius for comments and useful discussion. Footnotes Competing Interests: The authors have declared that no competing interests exist. Funding: This work was supported by the Director, Office of Science, of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231 and NIH grant #RHG002779A. References 1. Levine M, Tjian R. Transcription regulation and animal diversity. Nature. 2003;424:147–151. [PubMed] 2. Lee TI, Young RA. Transcription of eukaryotic protein-coding genes. Annu Rev Genet. 2000;34:77–137. [PubMed] 3. Lee TI, Rinaldi NJ, Robert F, Odom DT, Bar-Joseph Z, et al. Transcriptional regulatory networks in Saccharomyces cerevisiae. Science. 2002;298:799–804. [PubMed] 4. Blaiseau PL, Thomas D. Multiple transcriptional activation complexes tether the yeast activator Met4 to DNA. EMBO J. 1998;17:6327–6336. [PubMed] 5. Thomas D, Surdin-Kerjan Y. Metabolism of sulfur amino acids in Saccharomyces cerevisiae. Microbiol Mol Biol Rev. 1997;61:503–532. [PubMed] 6. Chiang DY, Moses AM, Kellis M, Lander ES, Eisen MB. Phylogenetically and spatially conserved word pairs associated with gene-expression changes in yeasts. Genome Biol. 2003;4:R43. [PubMed] 7. Shultzaberger RK, Roberts LR, Lyakhov IG, Sidorov IA, Stephen AG, et al. Correlation between binding rate constants and individual information of E. coli Fis binding sites. Nucleic Acids Res. 2007;35:5275–5283. http://www.ccrnp.ncifcrf.gov/toms/paper/fisbc/, http://dx.doi.org/10.1093/nar/gkm471. [PubMed] 8. Udalova IA, Mott R, Field D, Kwiatkowski D. Quantitative prediction of NF-κB DNA-protein interactions. Proc Natl Acad Sci USA. 2002;99:8167–8172. [PubMed] 9. Shultzaberger RK, Bucheimer RE, Rudd KE, Schneider TD. Anatomy of Escherichia coli Ribosome Binding Sites. J Mol Biol. 2001;313:215–228. http://www.ccrnp.ncifcrf.gov/toms/paper/flexrbs/ [PubMed] 10. Shultzaberger RK, Chen Z, Lewis KA, Schneider TD. Anatomy of Escherichia coli σ70 promoters. Nucleic Acids Res. 2007;35:771–788. http://www.ccrnp.ncifcrf.gov/toms/paper/flexprom/ [PubMed] 11. Chen H, Bjerknes M, Kumar R, Jay E. Determination of the optimal aligned spacing between the Shine-Dalgarno sequence and the translation initiation codon of Escherichia coli mRNAs. Nucleic Acids Res. 1994;22:4953–4957. [PubMed] 12. Rinke-Appel J, Junke N, Brimacombe R, Lavrik I, Dokudovskaya S, et al. Contacts between 16S ribosomal RNA and mRNA, within the spacer region separating the AUG initiator codon and the Shine-Dalgarno sequence; a site-directed cross-linking study. Nucleic Acids Res. 1994;22:3018–3025. [PubMed] 13. Hawley DK, McClure WR. Compilation and analysis of Escherichia coli promoter DNA sequences. Nucleic Acids Res. 1983;11:2237–2255. [PubMed] 14. McClure WR. Mechanism and control of transcription initiation in prokaryotes. Annu Rev Biochem. 1985;54:171–204. [PubMed] 15. Thomas D, Jacquemin I, Surdin-Kerjan Y. MET4, a leucine zipper protein, and centromere-binding factor 1 are both required for transcriptional activation of sulfur metabolism in Saccharomyces cerevisiae. Mol Cell Biol. 1992;12:1719–1727. [PubMed] 16. Kuras L, Barbey R, Thomas D. Assembly of a bZIP-bHLH transcription activation complex: formation of the yeast Cbf1-Met4-Met28 complex is regulated through Met28 stimulation of Cbf1 DNA binding. EMBO J. 1997;16:2441–2451. [PubMed] 17. Blaiseau PL, Isnard AD, Surdin-Kerjan Y, Thomas D. Met31p and Met32p, two related zinc finger proteins, are involved in transcriptional regulation of yeast sulfur amino acid metabolism. Mol Cell Biol. 1997;17:3640–3648. [PubMed] 18. Wieland G, Hemmerich P, Koch M, Stoyan T, Hegemann J, et al. Determination of the binding constants of the centromere protein Cbf1 to all 16 centromere DNAs of Saccharomyces cerevisiae. Nucleic Acids Res. 2001;29:1054–1060. [PubMed] 19. Schneider TD, Stephens RM. Sequence logos: A new way to display consensus sequences. Nucleic Acids Res. 1990;18:6097–6100. http://www.ccrnp.ncifcrf.gov/toms/paper/logopaper/ [PubMed] 20. Schneider TD. Information content of individual genetic sequences. J Theor Biol. 1997;189:427–441. http://www.ccrnp.ncifcrf.gov/toms/paper/ri/ [PubMed] 21. Schneider TD. Reading of DNA sequence logos: Prediction of major groove binding by information theory. Meth Enzym. 1996;274:445–455. http://www.ccrnp.ncifcrf.gov/toms/paper/oxyr/ [PubMed] 22. Bailey TL, Elkan C. The value of prior knowledge in discovering motifs with MEME. Proc Int Conf Intell Syst Mol Biol. 1995;3:21–29. [PubMed] 23. Tribus M. Princeton, N. J.: D. van Nostrand Company, Inc; 1961. Thermostatics and Thermodynamics. 24. Schneider TD, Stormo GD, Gold L, Ehrenfeucht A. Information content of binding sites on nucleotide sequences. J Mol Biol. 1986;188:415–431. http://www.ccrnp.ncifcrf.gov/toms/paper/schneider1986/ [PubMed] 25. Schneider TD. Theory of molecular machines. II. Energy dissipation from molecular machines. J Theor Biol. 1991;148:125–137. http://www.ccrnp.ncifcrf.gov/toms/paper/edmm/ [PubMed] 26. Gasch AP, Spellman PT, Kao CM, Carmel-Harel O, Eisen MB, et al. Genomic expression programs in the response of yeast cells to environmental changes. Mol Biol Cell. 2000;11:4241–4257. [PubMed] 27. Fauchon M, Lagniel G, Aude JC, Lombardia L, Soularue P, et al. Sulfur sparing in the yeast proteome in response to sulfur demand. Mol Cell. 2002;9:713–723. [PubMed] 28. Eisen MB, Spellman PT, Brown PO, Botstein D. Cluster analysis and display of genome-wide expression patterns. Proc Natl Acad Sci U S A. 1998;95:14863–14868. [PubMed] 29. Schneider TD. Strong minor groove base conservation in sequence logos implies DNA distortion or base flipping during replication and transcription initiation. Nucleic Acids Res. 2001;29:4881–4891. http://www.ccrnp.ncifcrf.gov/toms/paper/baseflip/ [PubMed] 30. Niedenthal RK, Sen-Gupta M, Wilmen A, Hegemann JH. Cpf1 protein induced bending of yeast centromere DNA element I. Nucleic Acids Res. 1993;21:4726–4733. [PubMed] 31. Kent NA, Eibert SM, Mellor J. Cbf1p is required for chromatin remodeling at promoter-proximal CACGTG motifs in yeast. J Biol Chem. 2004;279:27116–27123. [PubMed] 32. Chiang DY, Nix DA, Shultzaberger RK, Gasch AP, Eisen MB. Flexible promoter architecture requirements for coactivator recruitment. BMC Mol Biol. 2006;7:16. [PubMed] 33. Basehoar AD, Zanton SJ, Pugh BF. Identification and distinct regulation of yeast TATA box-containing genes. Cell. 2004;116:699–709. [PubMed] 34. Lehman AM, Ellwood KB, Middleton BE, Carey M. Compensatory energetic relationships between upstream activators and the RNA polymerase II general transcription machinery. J Biol Chem. 1998;273:932–939. [PubMed] 35. Schneider TD. Evolution of biological information. Nucleic Acids Res. 2000;28:2794–2799. http://www.ccrnp.ncifcrf.gov/toms/paper/ev/ [PubMed] 36. GuhaThakurta D, Stormo GD. Identifying target sites for cooperatively binding factors. Bioinformatics. 2001;17:608–621. [PubMed] |
PubMed related articles
Your browsing activity is empty. Activity recording is turned off. |
|||||||||||||||||||||||||||
Nature. 2003 Jul 10; 424(6945):147-51.
[Nature. 2003]Annu Rev Genet. 2000; 34():77-137.
[Annu Rev Genet. 2000]Science. 2002 Oct 25; 298(5594):799-804.
[Science. 2002]EMBO J. 1998 Nov 2; 17(21):6327-36.
[EMBO J. 1998]Microbiol Mol Biol Rev. 1997 Dec; 61(4):503-32.
[Microbiol Mol Biol Rev. 1997]Nucleic Acids Res. 2007; 35(16):5275-83.
[Nucleic Acids Res. 2007]Proc Natl Acad Sci U S A. 2002 Jun 11; 99(12):8167-72.
[Proc Natl Acad Sci U S A. 2002]J Mol Biol. 2001 Oct 12; 313(1):215-28.
[J Mol Biol. 2001]Nucleic Acids Res. 2007; 35(3):771-88.
[Nucleic Acids Res. 2007]Nucleic Acids Res. 1994 Nov 25; 22(23):4953-7.
[Nucleic Acids Res. 1994]Nucleic Acids Res. 1994 Aug 11; 22(15):3018-25.
[Nucleic Acids Res. 1994]Nucleic Acids Res. 1983 Apr 25; 11(8):2237-55.
[Nucleic Acids Res. 1983]Mol Cell Biol. 1992 Apr; 12(4):1719-27.
[Mol Cell Biol. 1992]Microbiol Mol Biol Rev. 1997 Dec; 61(4):503-32.
[Microbiol Mol Biol Rev. 1997]EMBO J. 1997 May 1; 16(9):2441-51.
[EMBO J. 1997]EMBO J. 1998 Nov 2; 17(21):6327-36.
[EMBO J. 1998]Mol Cell Biol. 1997 Jul; 17(7):3640-8.
[Mol Cell Biol. 1997]Microbiol Mol Biol Rev. 1997 Dec; 61(4):503-32.
[Microbiol Mol Biol Rev. 1997]Nucleic Acids Res. 2001 Mar 1; 29(5):1054-60.
[Nucleic Acids Res. 2001]Nucleic Acids Res. 1990 Oct 25; 18(20):6097-100.
[Nucleic Acids Res. 1990]J Theor Biol. 1997 Dec 21; 189(4):427-41.
[J Theor Biol. 1997]Methods Enzymol. 1996; 274():445-55.
[Methods Enzymol. 1996]Science. 2002 Oct 25; 298(5594):799-804.
[Science. 2002]Nucleic Acids Res. 1990 Oct 25; 18(20):6097-100.
[Nucleic Acids Res. 1990]Methods Enzymol. 1996; 274():445-55.
[Methods Enzymol. 1996]J Mol Biol. 2001 Oct 12; 313(1):215-28.
[J Mol Biol. 2001]Nucleic Acids Res. 2007; 35(3):771-88.
[Nucleic Acids Res. 2007]Nucleic Acids Res. 1994 Nov 25; 22(23):4953-7.
[Nucleic Acids Res. 1994]Nucleic Acids Res. 1994 Aug 11; 22(15):3018-25.
[Nucleic Acids Res. 1994]Nucleic Acids Res. 1983 Apr 25; 11(8):2237-55.
[Nucleic Acids Res. 1983]Nucleic Acids Res. 2007; 35(3):771-88.
[Nucleic Acids Res. 2007]J Theor Biol. 1997 Dec 21; 189(4):427-41.
[J Theor Biol. 1997]J Theor Biol. 1991 Jan 7; 148(1):125-37.
[J Theor Biol. 1991]Microbiol Mol Biol Rev. 1997 Dec; 61(4):503-32.
[Microbiol Mol Biol Rev. 1997]Mol Biol Cell. 2000 Dec; 11(12):4241-57.
[Mol Biol Cell. 2000]Mol Cell. 2002 Apr; 9(4):713-23.
[Mol Cell. 2002]Proc Natl Acad Sci U S A. 1998 Dec 8; 95(25):14863-8.
[Proc Natl Acad Sci U S A. 1998]Methods Enzymol. 1996; 274():445-55.
[Methods Enzymol. 1996]Nucleic Acids Res. 2001 Dec 1; 29(23):4881-91.
[Nucleic Acids Res. 2001]Nucleic Acids Res. 1993 Oct 11; 21(20):4726-33.
[Nucleic Acids Res. 1993]J Mol Biol. 2001 Oct 12; 313(1):215-28.
[J Mol Biol. 2001]Nucleic Acids Res. 2007; 35(3):771-88.
[Nucleic Acids Res. 2007]Mol Biol Cell. 2000 Dec; 11(12):4241-57.
[Mol Biol Cell. 2000]Mol Cell. 2002 Apr; 9(4):713-23.
[Mol Cell. 2002]J Biol Chem. 2004 Jun 25; 279(26):27116-23.
[J Biol Chem. 2004]J Theor Biol. 1997 Dec 21; 189(4):427-41.
[J Theor Biol. 1997]J Mol Biol. 2001 Oct 12; 313(1):215-28.
[J Mol Biol. 2001]Nucleic Acids Res. 2007; 35(3):771-88.
[Nucleic Acids Res. 2007]BMC Mol Biol. 2006 Apr 28; 7():16.
[BMC Mol Biol. 2006]EMBO J. 1998 Nov 2; 17(21):6327-36.
[EMBO J. 1998]EMBO J. 1998 Nov 2; 17(21):6327-36.
[EMBO J. 1998]Annu Rev Genet. 2000; 34():77-137.
[Annu Rev Genet. 2000]Cell. 2004 Mar 5; 116(5):699-709.
[Cell. 2004]BMC Mol Biol. 2006 Apr 28; 7():16.
[BMC Mol Biol. 2006]J Biol Chem. 1998 Jan 9; 273(2):932-9.
[J Biol Chem. 1998]Nucleic Acids Res. 2007; 35(3):771-88.
[Nucleic Acids Res. 2007]BMC Mol Biol. 2006 Apr 28; 7():16.
[BMC Mol Biol. 2006]J Mol Biol. 1986 Apr 5; 188(3):415-31.
[J Mol Biol. 1986]Nucleic Acids Res. 2000 Jul 15; 28(14):2794-9.
[Nucleic Acids Res. 2000]Microbiol Mol Biol Rev. 1997 Dec; 61(4):503-32.
[Microbiol Mol Biol Rev. 1997]Mol Cell. 2002 Apr; 9(4):713-23.
[Mol Cell. 2002]J Mol Biol. 1986 Apr 5; 188(3):415-31.
[J Mol Biol. 1986]Bioinformatics. 2001 Jul; 17(7):608-21.
[Bioinformatics. 2001]J Biol Chem. 2004 Jun 25; 279(26):27116-23.
[J Biol Chem. 2004]