Integrated Assessment and Prediction of Transcription Factor Binding

1 Department of Bioengineering, University of California San Diego, La Jolla, California, United States of America
2 Leibniz Institute for Age Research, Fritz Lipmann Institute, Jena, Germany
3 Leibniz Institute for Natural Product Research and Infection Biology, Hans Knöll Institute, Jena, Germany

Peer Bork, Editor
EMBL Heidelberg, Germany

* To whom correspondence should be addressed. E-mail: beyer/at/fli-leibniz.de

Received January 25, 2006; Accepted May 8, 2006.

Copyright: © 2006 Beyer et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Abstract

Systematic chromatin immunoprecipitation (chIP-chip) experiments have become a central technique for mapping transcriptional interactions in model organisms and humans. However, measurement of chromatin binding does not necessarily imply regulation, and binding may be difficult to detect if it is condition or cofactor dependent. To address these challenges, we present an approach for reliably assigning transcription factors (TFs) to target genes that integrates many lines of direct and indirect evidence into a single probabilistic model. Using this approach, we analyze publicly available chIP-chip binding profiles measured for yeast TFs in standard conditions, showing that our model interprets these data with significantly higher accuracy than previous methods. Pooling the high-confidence interactions reveals a large network containing 363 significant sets of factors (TF modules) that cooperate to regulate common target genes. In addition, the method predicts 980 novel binding interactions with high confidence that are likely to occur in so-far untested conditions.
Indeed, using new chIP-chip experiments we show that predicted interactions for the factors Rpn4p and Pdr1p are observed only after treatment of cells with methyl-methanesulfonate, a DNA-damaging agent. We outline the first approach for consistently integrating all available evidences for TF–target interactions and we comprehensively identify the resulting TF module hierarchy. Prioritizing experimental conditions for each factor will be especially important as increasing numbers of chIP-chip assays are performed in complex organisms such as humans, for which “standard conditions” are ill defined.

Synopsis

Transcription factors (TFs) bind close to their target genes for regulating transcript levels depending on cellular conditions. Each gene may be regulated differently from others through the binding of specific groups of TFs (TF modules). Recently, a wide variety of large-scale measurements about transcriptional networks has become available. Here the authors present a framework for consistently integrating all of this evidence to systematically determine the precise set of genes directly regulated by each TF (i.e., TF–target interactions). The framework is applied to the yeast Saccharomyces cerevisiae using seven distinct sources of evidences to score all possible TF–target interactions in this organism. Subsequently, the authors employ another newly developed algorithm to reveal TF modules based on the top 5,000 TF–target interactions, yielding more than 300 TF modules. The new scoring scheme for TF–target interactions allows predicting the binding of TFs under so-far untested conditions, which is demonstrated by experimentally verifying interactions for two TFs (Pdr1p, Rpn4p). Importantly, the new methods (scoring of TF–target interactions and TF module identification) are scalable to much larger datasets, making them applicable to future studies in humans, which are thought to have substantially larger numbers of TF–target interactions.
Introduction

Combinatorial transcriptional regulation is an important means of achieving highly specific expression of individual genes using small groups of transcription factors (TFs) [1–7]. These groups, called TF modules [3–6], integrate signals from different pathways to fine-tune the cellular response at the transcriptional level. The complexity of transcriptional regulation in higher species suggests that combinatorial regulation is of particular importance for metazoans [5,8]. However, detecting biologically significant TF modules is only possible if the gene targets regulated by each TF are known with high accuracy. Recently, measurement of TF–target binding relationships has become much more systematic through the technique of chromatin immunoprecipitation coupled with microarray chips (chIP-chip) [9–11]. By this approach, a TF of interest is immunoprecipitated along with all of the gene promoters and other genome fragments it binds in vivo; these fragments are identified by hybridization to a DNA microarray, thus elucidating all of the promoters bound directly by that TF. However, observed DNA binding in an upstream region alone is not always sufficient to indicate true interaction between a TF and a potential target gene [11,12]. Even if binding occurs, the event may not be biologically relevant, or the observed binding may relate to some cellular function other than gene expression. Moreover, unlike genome sequencing, which has a well-defined endpoint, interaction mapping projects are difficult to “complete” because a cell's pattern of interactions is strongly dependent on variables such as the cell type, genetic background, stage of development, time after stimulus, or specific environmental or biological condition. Accordingly, many true binding events may be missed by chIP-chip because the relevant conditions have not yet been examined.
Therefore, to correctly interpret measurements of TF–target binding, there is a need for computational methods that (1) identify which binding interactions have a regulatory function; (2) provide insight into new TF–target relationships that are likely to be condition-specific; and (3) perform an efficient yet exhaustive identification of TF modules, including quantification of their statistical significance. Existing bioinformatic approaches for assigning TFs to target genes rely on stepwise integration of one or a few lines of evidence, such as combining chIP-chip data [11] with TF binding motifs or coexpression [3,4,13–18]. Other approaches combine TF binding locations with diverse biological data to infer regulatory networks [19,20], but require the prior assignment of interactions and interaction probabilities. Here, we implement a Bayesian approach that integrates all available types of genome-scale evidence to construct accurate transcriptional regulatory networks. In addition to measurement of direct promoter binding and detection of DNA binding motifs, we find that evidence of gene fusion and shared phylogenetic profiles (i.e., co-occurrence in a significant number of species) is surprisingly informative for predicting true regulatory interactions. High-confidence interactions are used to identify TF modules (i.e., sets of TFs that cooperate to regulate a significant number of genes in common). Application of this procedure to integrate genome-scale data for yeast reveals a large hierarchical network of regulatory relationships and predicts many new condition-specific transcriptional interactions. We validate several of these interactions through new chIP-chip experiments for Rpn4p and Pdr1p, two transcription factors predicted to bind many new targets in response to chemical stress. Incorporation of these new binding data into modules reveals cross-talk between TFs involved in the response to stress, histone regulation, and regulation of the cell cycle. 
Results/Discussion

Overview of the Approach

To permit the construction of accurate transcriptional networks, we developed an integrative framework to quantify the likelihood of direct regulatory interaction between a TF and each of its possible target genes (Figure 1).
Given a method for assessing interaction reliability, we also sought to organize high-confidence interactions into TF modules (i.e., sets of TFs that cooperatively regulate sets of genes). For this purpose, we applied an algorithm that identified all TF combinations regulating common targets and assigned p-values of significance to these overlaps [26]. This method of module identification is scalable to much larger datasets, which will be particularly necessary in view of the complex transcriptional regulation observed in higher eukaryotes [8]. Given a set of TF modules, the integrated log-likelihood scores (LLSs) could subsequently be refined in a process that examined the overlap between modules and gene expression clusters. Further details on the Bayesian integration and the module identification procedures are provided in Materials and Methods.

Diverse Evidence Types Are Informative of True TF Interactions

We applied our integrative Bayesian approach to assign confidence scores to every potential TF–target pair in yeast. Seven distinct lines of evidence were made available to the model (Figure 1).
Four types of 2hops were examined, in which the first hop (X → Y) was always measured by TF binding (evidence B), and the second hop (Y ↔ Z) was supported by evidences E, P, Y, or G, giving 2hops BE, BP, BY, and BG, respectively (Figure 2).

Combining all lines of evidence (B, S, O, C, and 2hops) yielded a total of 7,817 high-confidence interactions with integrated LLS > 5 (Table S1). We found that the distinction of known true and false interactions could be further improved by requiring that one of the evidences for DNA binding (B, S) and one evidence for functional interaction (O, BE, BP, BY, BG, C) have an LLS > 0.5 (Figure 3).
2hops were informative for scoring a substantial number of putative transcriptional interactions (Table S1). For instance, for 359 high-confidence predictions (LLS > 5), the underlying evidence was based exclusively on 2hops and membership in a coexpression cluster, without observed chIP-chip binding and without significant binding motifs. By the same criterion, another 419 (8%) interactions with significant observed binding were supported only by 2hops or cluster membership but not by DNA binding motifs. Given the absence of observed motifs, it is possible that these TFs do not directly bind DNA but serve as cofactors together with DNA-binding TFs. Two well-known examples of cooperative regulation “at a distance” are the histone regulators Hir1p and Hir2p [30]. Based largely on 2hops, our model obtained very consistent evidence for interactions connecting these two factors to eight histone-related genes (Table 1).
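The scoring and filtering described in this section can be illustrated with a minimal Python sketch. It assumes that the LLS of an evidence bin is the natural logarithm of the posterior odds divided by the prior odds, following the definition attributed to Lee et al. [21], and that integrated scores are obtained by simple addition. The function names, the bin counts, and the example scores are hypothetical and serve only as an illustration; they are not values from the study.

```python
import math

# Evidence codes as used in the text: B and S support DNA binding,
# the remaining types support a functional TF-target link.
BINDING = ("B", "S")
FUNCTIONAL = ("O", "BE", "BP", "BY", "BG", "C")


def bin_lls(pos_in_bin, neg_in_bin, pos_total, neg_total):
    """LLS of one evidence bin: ln(posterior odds / prior odds).

    Odds are estimated from counts of positive and negative control
    interactions that fall into the bin and in the whole training set.
    """
    posterior_odds = pos_in_bin / neg_in_bin
    prior_odds = pos_total / neg_total
    return math.log(posterior_odds / prior_odds)


def integrated_lls(scores):
    """Sum the LLSs of all evidence types (simple addition)."""
    return sum(scores.values())


def passes_filter(scores, min_each=0.5):
    """Require one binding evidence and one functional evidence above min_each."""
    has_binding = any(scores.get(k, 0.0) > min_each for k in BINDING)
    has_functional = any(scores.get(k, 0.0) > min_each for k in FUNCTIONAL)
    return has_binding and has_functional


def is_filtered_high_confidence(scores, min_total=5.0):
    """High integrated LLS combined with the binding/functional filter."""
    return integrated_lls(scores) > min_total and passes_filter(scores)


# Hypothetical TF-target pair with three informative evidence types.
example = {"B": 3.0, "S": 1.2, "C": 1.5}
print(integrated_lls(example), is_filtered_high_confidence(example))
```

A pair supported only by 2hops and coexpression would reach a high integrated LLS in this sketch but would not pass the additional filter, which mirrors the distinction drawn in the text between the unfiltered and the filtered selections.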
TF Module Hierarchy

Pooling high-confidence TF–target interactions revealed a total of 363 significant TF modules (pmod < 10^-4), each of which contained two to 13 distinct TFs. Examples of identified modules are shown in Figure 4.
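The module probability outlined in Materials and Methods can be sketched in Python as follows. The sketch assumes that the probability of drawing a module M of n TFs in one random set is n! times the product of the relative frequencies f_i/F, and that the number of random trials for a gene regulated by k TFs equals the number of size-n subsets of those k TFs. The second point is our reading of the description and should be treated as an assumption. The function names and interaction counts are hypothetical.

```python
import math


def relative_frequencies(counts, total_interactions):
    """Relative frequency f_i / F for each TF of the module."""
    return [c / total_interactions for c in counts]


def module_probability(freqs, k):
    """Probability of finding a module among the k TFs regulating one gene.

    freqs: relative frequencies of the n TFs forming the module.
    k:     number of TFs regulating the gene (the module needs k >= n).
    """
    n = len(freqs)
    if k < n:
        return 0.0
    # Probability that one random set of n TFs is exactly the module.
    p_single = math.factorial(n) * math.prod(freqs)
    # Assumed number of trials: all size-n subsets of the k TFs.
    trials = math.comb(k, n)
    return 1.0 - (1.0 - p_single) ** trials


# Hypothetical two-TF module with 40 and 25 targets in a network of 5,000 interactions.
freqs = relative_frequencies([40, 25], 5000)
for k in (2, 4, 8):
    print(k, module_probability(freqs, k))
```

The probability grows with k because a gene bound by many TFs offers more subsets in which the module could appear by chance. The weighting over set sizes and the final p-value based on the number of target genes are not covered by this sketch.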
Benchmarking and Comparison to Previous Approaches

We next compared the integrative approach (“Bayes”) to two previous methods, one based on a chIP-chip binding measurement alone (“binding only”) [10,11], and the other requiring the presence of a conserved TF binding motif in addition to observed binding (“binding + motif”) [11]. In a two-fold cross-validation we randomly split the reference interactions into two datasets (A and B) of equal size. Subsequently, we used A to train the statistical model and tested it on B, and vice versa (Figure 3).

While the receiver operating characteristic (ROC) curves imply better coverage of our approach, we also wanted to assess the quality of these predictions. If several target genes are regulated by the same TF, one might expect these genes to be coexpressed and to have similar cellular functions. This notion provided a means to benchmark the integrative Bayes classifier versus the other methods for TF–target assignment (Figure 3).

Prediction of New Transcriptional Interactions

Beyond assigning confidences to raw interaction measurements, we investigated whether an integrative approach could predict interactions that had not yet been observed experimentally. Overall, our high-confidence set of 5,124 TF–target pairs included 980 interactions that were based on multiple lines of evidence but were not supported by direct chIP binding (LLS for binding < 0.05). We hypothesized that for many of these TF–target pairs, direct binding might indeed occur but in conditions that had not been previously measured. Although the available chIP binding data included profiles for most TFs in nominal conditions (YPD media), few of these factors had been examined in more than one to two other conditions [11].
To test our hypothesis, we applied a cross-validation procedure in which LLS values were recalculated using only chIP data from nominal conditions, and the resulting TF–target pairs with high LLSs were compared with the available binding measurements from other growth conditions (Figure 3).

Discovery and Validation of Rpn4p and Pdr1p Transcriptional Reprogramming

Encouraged by the above cross-validation results, we sought to experimentally verify several of the interactions predicted to operate under new conditions. Rpn4p and Pdr1p exhibit significant transcriptional reprogramming under oxidative stress (Figure 5).
Of the 104 predicted interactions (LLS > 4) for Rpn4p and Pdr1p that did not have prior chIP-chip binding evidence, 19 had significant p-values of binding under MMS (Table 3; overlap is significant at p = 1.7 × 10^-7, hypergeometric distribution). Accordingly, Figure S2 shows that TF–target pairs observed under MMS tend to also have high LLS values according to the Bayes classifier. Thus, the LLS can predict novel DNA binding interactions, even if no such binding has been observed previously.
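The significance of such an overlap can be assessed with the upper tail of the hypergeometric distribution, as in the following Python sketch. The counts of 104 predictions and 19 confirmed interactions are taken from the text. The population size and the number of promoters bound under MMS are not stated here, so the values used below are hypothetical placeholders and the printed result is not expected to reproduce the reported p-value. The function name is also an assumption.

```python
from math import comb


def hypergeom_upper_tail(overlap, population, successes, draws):
    """P(X >= overlap) for X ~ Hypergeometric(population, successes, draws).

    population: all candidate TF-target pairs considered
    successes:  pairs with significant binding in the new experiment
    draws:      pairs predicted by the classifier
    overlap:    predicted pairs that were also bound
    """
    upper = min(successes, draws)
    total = comb(population, draws)
    tail = sum(
        comb(successes, x) * comb(population - successes, draws - x)
        for x in range(overlap, upper + 1)
    )
    return tail / total


# 104 predictions and 19 confirmed pairs come from the text;
# the population of 12,000 pairs and 600 bound pairs are hypothetical.
print(hypergeom_upper_tail(19, 12000, 600, 104))
```

The exact integer arithmetic of `math.comb` avoids rounding problems in the tail sum; for very large populations a log-space implementation would be faster.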
Relating the new MMS binding data to TF modules suggested that, although both TFs respond to DNA damage, they regulate distinct sets of genes in a nonredundant manner. First, Pdr1p and Rpn4p were never present in the same module (they had no common targets at LLS > 5). Instead, Pdr1p formed a TF module with Pdr3p (Table 2), reflecting an earlier observation that Pdr1p and Pdr3p can bind as homo- or heterodimers to the same binding sites [33]. On the other hand, Rpn4p shared targets predominantly with other stress-related TFs such as Yap1p or Yap7p. Either Pdr1p or Rpn4p could form a module with the cell-cycle regulator Cbf1p (Figure 5).

Rpn4p and Pdr1p exhibit distinct stress response schemes. While Rpn4p primarily binds under stress conditions but not under nominal conditions, Pdr1p binds a large fraction of its targets under nominal conditions. These binding sites are released by Pdr1p under stress (clusters b-ii and b-iii). A second group of Pdr1p targets comprises genes that are unbound under any of the tested conditions (cluster b-i) or just weakly bound under MMS stress (cluster b-iv). Binding of Pdr1p was not observed for any of its significant (LLS > 5) targets under oxidative stress. The distinct regulatory patterns are at least partially explained by cofactors that act in concert with Rpn4p or Pdr1p in a modular fashion. For instance, clusters b-i and b-iv are regulated by Pdr1p and Pdr3p together, whereas clusters b-ii and b-iii contain no targets of Pdr3p. A consistent pattern emerged indicating that genes regulated by Pdr1p but not by Pdr3p are bound under nominal conditions, whereas those regulated by the Pdr1p/Pdr3p complex are not. In support of previous speculations, our findings suggest that dimer composition affects binding site specificity of Pdr1p and Pdr3p [33].
Conclusions

In summary, we have developed an approach for assigning likelihood scores to transcriptional interactions based on integration across eight types of direct and indirect evidence. The integration of different lines of evidence serves two major purposes. First, if binding was already observed in chIP-chip experiments, additional evidence helps reduce the number of false positive predictions by verifying that the interaction between a TF and its target gene is functional. Second, if no binding has been observed, other evidences may reduce false negative predictions and suggest that interactions may occur under so-far untested conditions. Based on the latter, we were able to experimentally confirm 19 new transcriptional interactions that are active during damage-related stress. We have also explored how high-confidence TF–target interactions can be used to infer the TF module hierarchy underlying transcriptional gene regulation. In this regard, our analysis of modules involving Pdr1p, Rpn4p, Hir1p, and Hir2p suggested how cells achieve a high degree of specificity by combining generic factors with other more specific factors into complex regulatory units. Although we have focused on yeast, the framework is general and may be especially relevant as large-scale transcriptional mapping projects get under way in humans.

Materials and Methods

Control sets. A set of 484 high-confidence TF–target interactions was created as a positive control by extracting regulatory interactions from the Incyte YPD Database (http://www.incyte.com), which is based on a curated, literature-derived dataset. In order to obtain negative control data, five sets of random TF–gene associations were generated, where each set contained > 9,000 interactions.
Cocitation criteria [21] were applied to further enhance the stringency of the control sets: interactions in the positive control set were required to have a significantly enriched number of cocitations (p < 0.1), whereas interactions in the negative control set were required not to have a cocitation link.

Likelihood ratios. LLSs were calculated as described in Lee et al. [21]:

$$\mathrm{LLS} = \ln\left(\frac{P(L \mid E)\,/\,P(\neg L \mid E)}{P(L)\,/\,P(\neg L)}\right) \qquad (1)$$

where P(L) is the prior probability of observing a true TF–target interaction, P(¬L) is the prior probability of not observing a true TF–target interaction, P(L)/P(¬L) is the prior likelihood ratio for observing a true interaction, and P(L|E)/P(¬L|E) is the posterior likelihood ratio after observing the evidence E. Input evidences were binned ensuring that bins for one evidence type always have the same size, and LLSs were calculated for each bin based on the positive and negative training sets. The best regression (maximizing R^2) through the resulting LLSs was used for predicting LLSs of unknown TF–gene pairs (Figure 2).

Adding LLSs of different evidences is appropriate if the input data are statistically independent. Lee et al. [21] propose a weighting factor D accounting for dependence between evidences. We used ROC curves and the Matthews correlation coefficient (MCC) [36] to judge the quality of different weighting factors; the best performance was obtained by simple addition of LLSs (i.e., D = 1). Filtering positive from negative interactions was improved by additionally requiring that at least one of the two evidences (B or S) and one of the remaining evidence types have LLS > 0.5. If written as pseudocode, this rule reads: SELECT IF( (B > 0.5 OR S > 0.5) AND (O > 0.5 OR TE > 0.5 OR…OR C > 0.5) ). The rationale for this grouping of evidences is that B and S provide evidence for (possible) upstream binding, but they do not imply a true regulatory interaction. The remaining evidences, on the other hand, functionally link the TF to its target gene. Note that evidence O implies that the binding site is conserved upstream of the respective gene in a significant number of species, which also suggests a functional interaction. Different LLS thresholds were tested in steps of ΔLLS = 0.05; the threshold at 0.5 was found to maximize the MCC. The final predictions were based on the sum of all evidences (Figure 3).

Coexpression evidence. The score for TF–target interactions was computed over two steps.
In the first step, an initial set of target genes (LLS > 6) and TF modules was identified with high confidence. In the second step, these high-confidence target sets were used to search for additional targets that were coexpressed with the existing ones, in a manner similar to Bar-Joseph et al. [4]. For the second step, expression data covering a broad range of cellular functions (Figure 1) were used.

Binding motifs. TF binding site motifs were defined as position-specific weight matrices (PWMs) [37]. PWMs were compiled for 111 different individual TFs from Harbison et al. [11] and from public databases [38,39]. When more than one matrix was defined for the same TF, the PWM with the highest information content per position (relative entropy) was selected. Using the PWM scoring functionality of ANN-Spec [37], the score distribution for each motif was determined over possible subsequences of all intergenic regions such that a score threshold could be selected to ensure that the fraction of predicted binding sites was < 10^-4. The genome sequences of promoter regions in S. cerevisiae (2,000 bp upstream of each gene) and of the promoters of homologous genes in four other sensu stricto species (1,000 bp upstream of each homolog) were obtained from SGD [31] (download from June 2005; Washington University, St. Louis, Missouri, United States and Massachusetts Institute of Technology, Cambridge, Massachusetts, United States). Log-likelihood ratios were calculated separately for every species and the LLSs of the four related species were added into one LLS for the evidence O (“binding motif in other yeast species”).

2hops. A 2hop relationship exists between a transcription factor, A, and a gene, C, via an intermediate gene, B, if there is evidence that A regulates B, and B is functionally linked to C. The two evidence types are then transformed into respective likelihood scores LS_AB and LS_BC. The product of the two LSs is proportional to the product of the posterior likelihood ratios:

$$E_{AC} = LS_{AB} \cdot LS_{BC} \;\propto\; \frac{P(L_{AB} \mid E_{AB})}{P(\neg L_{AB} \mid E_{AB})} \cdot \frac{P(L_{BC} \mid E_{BC})}{P(\neg L_{BC} \mid E_{BC})} \qquad (2)$$

Note that the denominator in Equation 1 is essentially the fraction of true interactions among all possible interactions [21]; hence, it is a constant for all interaction pairs AB and BC. Therefore, E_AC from Equation 2 is proportional to the probability that the network path from A via B to C actually exists given the evidences E_AB and E_BC. Thus, E_AC served as evidence for a direct link from A to C, and the likelihood ratio LS_AC was calculated from E_AC based on the training data in the same way as for all other input evidences. In this study LS_AB was always based on chIP-binding p-values, and LS_BC was taken from Lee et al. [21].

Transcription factor modules and p-value estimation. TF modules were determined using our previously described method [26], yielding closed sets of TFs associated with distinct sets of target genes. The p-values quantify the likelihood of observing the given TF module in a randomized regulatory network of the same size and same number of TFs. Briefly, a TF module M is defined as a set of n distinct transcription factors (m_1, …, m_i, …, m_n). Let F be the total number of all TF–target interactions and f_i be the number of interactions from m_i to its target genes in the entire network. We then compute the relative frequency of m_i as $\hat{f}_i = f_i/F$. A random set of n TFs has n! different permutations and thus, the probability of finding M in a random set of n transcription factors is $n! \times \prod_i \hat{f}_i$. Note that this implicitly assumes that the probability of drawing a TF m_i is independent of the other TFs in the module. This assumption may be violated for small numbers of TFs, because the probability of drawing one TF would then depend on the TFs that have already been withdrawn from the “pool.” In our case we have >100 different TFs. We compared the direct estimation of the p-values with random permutations of the TF–target interactions to verify that the pool size does not affect the p-values.
We observed no significant deviations between the two schemes (data not shown). Next, it is possible to calculate the probability, p_M, of finding M in the set of k ≥ n TFs that regulate a given gene. This p_M can be computed as 1 minus the probability of not finding M in $\binom{k}{n}$ random trials:

$$p_M(k) = 1 - \left(1 - n! \prod_{i=1}^{n} \frac{f_i}{F}\right)^{\binom{k}{n}} \qquad (3)$$

Note that a TF occurs at most once in every set. The p_M are calculated for all set sizes k appearing in the original (observed) data, and a weighted sum P_M over these set sizes is calculated.
Apart from being scalable, our approach has a number of advantages in comparison to existing algorithms for finding TF modules [4,15,18,40–42]: (1) all available evidences can be integrated into one common score; (2) the variable predictive power of different evidences is taken into account; (3) there is no size threshold on the number of TFs in each module; (4) all modules at all hierarchical levels are identified, without the need to restrict the search to a specific hierarchical level (“slicing”; see also [18]); (5) target genes and TFs can be members of several modules; (6) the algorithm is not restricted to TFs with known binding matrices; and (7) we assign a p-value to every TF module based on the number of target genes. The importance of some of these aspects has been discussed previously [18,42]. Existing approaches cover some of these features (e.g., genetic regulatory modules (GRAM) [4] fulfills 5 and 6, and the extended signature algorithm [42] fulfills 3, 4, and 5). Features 1, 2, and 7 are unique to our method.

Coexpression clustering and centrality. Microarray mRNA expression data were taken from the literature [1,43–45] (23 different conditions, 310 profiles). Genes were clustered separately for each study or group of conditions (i.e., cell cycle, stress-related, metabolism) using only genes that changed significantly (standard deviation of log2-fold change > 0.45) [46]. Gene clusters were obtained using a multistep procedure that determines the total number of clusters (k) and the cluster membership of each gene. Within each step, clustering is performed using the fuzzy c-means algorithm [46], which estimates the probability of membership of every gene to each cluster. The initial k was set to the largest value allowed for the given dataset (3 times the number of profiles); all other parameters were set to default values. Genes with membership values less than 0.2 were removed from the respective clusters.
We define “centrality” as the average membership value of a subset S_C of a cluster C normalized by the average of all memberships for cluster C.

Genome-wide chIP-chip analysis. Haploid W303-derived strains harboring either rpn4 or pdr1 tagged with the cMyc epitope were obtained from the laboratory of Dr. Richard A. Young at the Whitehead Institute for Biomedical Research (Cambridge, Massachusetts, United States). Cells were grown to log-phase in YPD media at 30 °C, then treated with 0.03% MMS for 1 h. Protein-DNA binding locations were assayed using a chIP-chip protocol previously described [10] with corresponding IP-enriched and unenriched samples cohybridized to a single cDNA microarray containing all yeast intergenic sequences derived from PCR amplification. Microarray data were analyzed using the VERA error-modeling package [47] to generate p-values of TF binding for each promoter region. Unlike for gene expression analysis, in which both increases and decreases in fluorescent intensity are of interest, DNA binding is indicated for increases only, representing increased promoter binding in the IP-enriched versus IP-unenriched sample. We therefore modified the VERA likelihood ratio test to use a one-sided statistic by forcing μx > μy in the denominator of Equation 5 of Ideker et al. [47]. To derive p-values from the log-likelihood ratio statistic, we indexed values on the cumulative distribution for a negative control experiment: IP-unenriched versus IP-unenriched over three replicate microarrays.

Supplementary data files can also be downloaded from the accompanying Web site at http://www.fli-leibniz.de/tsb/tfb.

Figure S1: Functional Homogeneity and Coexpression of Target Sets Using Annotations Based on Munich Information Center for Protein Sequences and SGD
(A) Using selections at specificity = 0.995 (i.e., LLS > 5). (B) Using selections at specificity = 0.997 (i.e., LLS > 6). See Figure 3. (571 KB PNG)

Figure S2: Average LLS versus Binding p-Value under MMS Stress
Sliding window average of LLS is shown. Horizontal lines are average LLSs over all genes. LLSs were determined without MMS binding data, but using all data from Harbison et al. [11]. LLSs are significantly increasing with decreasing binding p-values (Pdr1p: p = 2 × 10^-8; Rpn4p: p = 3 × 10^-25; two-sided t test for difference of the correlation coefficient from zero). (218 KB PNG)

(240 KB PNG)

Table S1: All TF–Target Interactions with LLS > 4
First column contains TF name, second column contains target gene, and columns 3 to 11 contain the respective evidences expressed as LLS. The last column contains the sum of the individual LLS. An extended table including TF–target pairs at lower thresholds can be downloaded from http://www.fli-leibniz.de/tsb/tfb. (539 KB PDF)

Table S2: TF Modules for LLS Threshold 4
Module names are followed by module size, number of target genes, and pmod. An extended table including the target genes can be downloaded from http://www.fli-leibniz.de/tsb/tfb. (553 KB EPS)

Table S3: TF Modules for LLS Threshold 5
Module names are followed by module size, number of target genes, and pmod. An extended table including the target genes can be downloaded from http://www.fli-leibniz.de/tsb/tfb. (272 KB EPS)

Table S4: Significant Overlaps (p < 10^-4) between Target Gene Sets and Coexpression Clusters
Target gene sets are targets of either individual TFs or TF modules. All TF–target interactions have an LLS > 5. Clusters were determined with fuzzy c-means (see Materials and Methods). Genes with membership values < 0.2 were excluded from the clusters. Significance of overlaps was determined assuming a hypergeometric distribution. (46 KB PDF)

Table S5: Positive Control Set of TF–Target Interactions
The negative control sets can be downloaded from http://www.fli-leibniz.de/tsb/tfb. (7 KB PDF)

Acknowledgments

We wish to thank Dr. Ziv Bar-Joseph (Carnegie Mellon University) for providing us his deconvolved cell-cycle expression data and for helpful comments.
Footnotes

Author contributions. AB and TI conceived and designed the experiments. CW performed the experiments. AB, CW, JH, DR, and UM analyzed the data. AB, CW, UM, TW, and TI wrote the paper.

Funding. AB's stay at UCSD was funded by the German Academic Exchange Service. Additional funding was provided by the German Federal Ministry for Education and Research (BMBF), grant 0312704E.

Competing interests. The authors have declared that no competing interests exist.

A previous version of this article appeared as an Early Online Release on May 8, 2006 (DOI: 10.1371/journal.pcbi.0020070.eor).