Copyright © 2007, Cold Spring Harbor Laboratory Press

Functional conservation of Rel binding sites in drosophilid genomes

1 Wellcome Trust Centre for Human Genetics, Oxford University, Oxford OX3 7BN, United Kingdom; 2 Molsoft L.L.C., La Jolla, California 92037, USA; 3 Kennedy Institute of Rheumatology, Imperial College, London W6 8LH, United Kingdom

4 Corresponding authors. E-mail i.udalova/at/imperial.ac.uk; fax 44-208-3834499. E-mail copley/at/well.ox.ac.uk; fax 44-1865-287664.

Received March 13, 2007; Accepted June 21, 2007.

Freely available online through the Genome Research Open Access option.

Abstract

Evolutionary constraints on gene regulatory elements are poorly understood: Little is known about how the strength of transcription factor binding correlates with DNA sequence conservation, and whether transcription factor binding sites can evolve rapidly while retaining their function. Here we use the model of the NFKB/Rel-dependent gene regulation in divergent Drosophila species to examine the hypothesis that the functional properties of authentic transcription factor binding sites are under stronger evolutionary constraints than the genomic background. Using molecular modeling we compare tertiary structures of the Drosophila Rel family proteins Dorsal, Dif, and Relish and demonstrate that their DNA-binding and protein dimerization domains undergo distinct rates of evolution. The accumulated amino acid changes, however, are unlikely to affect DNA sequence recognition and affinity. We employ our recently developed microarray-based experimental platform and principal coordinates statistical analysis to quantitatively and systematically profile DNA binding affinities of three Drosophila Rel proteins to 10,368 variants of the NFKB recognition sequences. We then correlate the evolutionary divergence of gene regulatory regions with differences in DNA binding affinities.
Genome-wide analyses reveal a significant increase in the number of conserved Rel binding sites in promoters of developmental and immune genes. Significantly, the affinity of Rel proteins to these sites was higher than to less conserved sites and was maintained by the conservation of the DNA binding site sequence (static conservation) or in some cases despite significantly diverged sequences (dynamic conservation). We discuss how two types of conservation may contribute to the stabilization and optimization of a functional gene regulatory code in evolution.

Despite the availability of whole-genome sequences of related species, the evolution of transcriptional regulation is still poorly understood (Ludwig 2002; Xie et al. 2005). Transcription is controlled by the binding of “regulatory proteins” (i.e., transcription factors, TFs) to “DNA regulatory elements” (i.e., promoters and transcription enhancers). Functionally important regulatory sequences are usually conserved among related species. Indeed, in a few examples of well-characterized transcriptional enhancers, such as the “even-skipped” stripe 2 enhancer (Stanojevic et al. 1991), most but not all functionally important binding sites are conserved in 13 Drosophila species (Ludwig et al. 1998, 2000). On the other hand, the preservation of an optimal level of gene expression may allow and even support changes in regulatory sequences, where there are compensatory changes in transcription factors and the regulatory sequences (Landry et al. 2005). Such compensatory changes can include amino acid substitutions in the DNA-binding domains of transcription factors that alter the pattern of DNA sequence recognition and reciprocal changes in DNA regulatory sequences (Juarez et al. 2000). Another factor that may lead to changes in DNA regulatory sequences is “fuzziness” of a TF binding site itself.
Certain transcription factors can bind to a number of closely related DNA sequences with similar affinities; this may allow for neutral evolution of binding sites without significant effects on TF binding (Gerland and Hwa 2002). A major impediment to understanding the contribution of binding site sequence “fuzziness” in the evolution of regulatory DNA sequences is the lack of systematic, accurate, and quantitative measurements of binding affinities to binding site sequence variants. Recently we and others used a high-throughput microarray-based assay to address this problem (Bulyk et al. 2001; Linnell et al. 2004; Mukherjee et al. 2004). We have also developed the principal coordinate (PC) statistical analysis for analyzing protein–DNA interaction data (Udalova et al. 2002). The PC analysis accurately predicts the effect of nucleotide variations within the binding motif on protein binding affinity and automatically incorporates the effects of interactions between base pair positions in the binding site. The resulting comprehensive tables of binding affinities improve on traditional position-weight-matrix models that may fail to depict true binding specificities because they assume that each nucleotide in a binding site exerts an independent effect (Benos et al. 2002; Bulyk et al. 2002). The Drosophila Dorsal, Dif, and Relish proteins are three members of the Rel Homology Domain (RHD) containing class of transcription factors represented in humans by the NFKB family (Silverman and Maniatis 2001). During Drosophila development, Dorsal has a key role in initiating the dorsal-ventral patterning pathway, and its target genes and their enhancers have been well studied in this context (Stathopoulos et al. 2002; Papatsenko and Levine 2005; Biemar et al. 2006). 
Most notably, from an evolutionary standpoint, Papatsenko and Levine (2005) have analyzed Dorsal binding sites in 18 target gene enhancers, across four species of Drosophila, and demonstrated that 80% of optimal (high affinity) binding motifs are within evolutionally conserved sequence blocks. Dorsal, Dif, and Relish also play a critical role in the innate immune response, a function they share with NFKB in mammals. The innate immune response is an ancient evolutionary defense mechanism against microbial pathogens that is conserved from Drosophila to mammals (Hoffmann et al. 1999; Silverman and Maniatis 2001). When challenged by microbes, insects discriminate between various classes of microorganisms by activating specific intracellular signaling pathways that lead to production of antimicrobial peptides and other effector molecules (Hoffmann 2003). The Toll signaling pathway is activated in response to Gram-positive bacteria and fungal infection, and leads to the nuclear translocation of Dorsal and Dif. These transcription factors bind to DNA sequences upstream from genes encoding a large number of antifungal peptides such as drosomycin and metchnikowin. An alternate signaling pathway is activated in response to Gram-negative bacteria and results in the proteolytic cleavage of the precursor for Relish. Processed Relish translocates into the nucleus and activates expression of anti-bacterial peptides, such as diptericin and attacins. Little is known about the conservation of Rel binding sites in the orthologous enhancers of the innate immunity genes. Here we investigate the molecular evolution of RHD-containing proteins and their associated binding sites using genome sequence assemblies of seven Drosophila species (D. melanogaster, D. simulans, D. yakuba, D. ananassae, D. pseudoobscura, D. virilis, D. mojavensis). 
We model the structures of Dorsal, Dif, and Relish based on available crystallographic structures of the mammalian c-Rel protein, and examine the possibility that changes in amino acid sequences in the DNA-binding domains during evolution alter the pattern of DNA sequence recognition and binding affinity. We generate quantitative binding affinity data for these proteins using a microarray-based binding assay and the PC analysis (Udalova et al. 2002) and examine the changes in DNA sequences and binding affinities of putative Rel binding sites on a genome-wide basis. We demonstrate that the Rel binding sites in the vicinity of innate immunity and developmental genes are under strong functional constraints. Our work extends that of Papatsenko and Levine (2005) by investigating the evolutionary dynamics within particular conserved Rel binding sites, and by examining the differences between those sites that are associated with target and nontarget genes. We address two key questions: (1) Does binding site affinity correlate with DNA sequence conservation and (2) is binding site conservation “dynamic,” that is, can it show high levels of nucleotide substitution while maintaining functionality?

Results

Conservation of DNA-contacting residues in Rel Homology Domains of Dorsal, Dif, and Relish

Dorsal, Dif, and Relish proteins share a homologous region, the Rel Homology Domain (RHD), of ~300 amino acids (aa), that is responsible for DNA-binding and dimerization (this region is covered by the Pfam domains RHD and TIG [Finn et al. 2006], but, in common with others, we use RHD to refer to this entire region). We investigated the likely functional consequences of amino acid changes in the RHD during evolution of drosophilids, by first predicting the gene sequences of Dorsal, Dif, and Relish orthologs in each species, as described in the Methods, and then mapping protein sequence divergence within ortholog and paralog sets to known 3D structures.
No structures of complete insect RHD-containing proteins, i.e., including both the Pfam RHD and TIG domains, are currently available (the structure of Gambif1 from mosquito does not cover the TIG; Cramer et al. 1999). Consequently we modeled the 3D structure of Dorsal, Dif, and Relish homodimers bound to DNA using the structure of mammalian c-Rel as a template (PDB accession no. 1gji; Huang et al. 2001) (see Methods).

[Figure 1A]
These results suggest that DNA-binding specificity data obtained for Dorsal and Relish proteins are likely to be directly applicable to the other species studied. While the majority of DNA-binding residues are also conserved within orthologs of Dif, the sequence variation that does occur suggests that more caution needs to be applied when making inferences between species for this gene.

Rel homodimers have different DNA-binding preferences

Alignment of the RHD domains of the D. melanogaster Dorsal, Dif, and Relish paralogs (Fig. 1B,C) [...]

To examine the functional effect of the observed differences on protein–DNA recognition, we profiled DNA-binding specificities of Dorsal, Dif, and Relish homodimers to thousands of DNA variants. Oligonucleotide duplexes corresponding to 182 variants of the minimal spanning set uniformly covering the extended NFKB/Rel GGRDNNHHBS consensus, derived from the published examples of binding for mammalian NFKB and insect Rel proteins, were spotted in quadruplicate onto Codelink slides. Binding of each dimer to DNA sequences was monitored in three independent experiments. When experimental binding affinities were compared between the three proteins, we found that Dorsal had overlapping binding specificities with both Dif (correlation coefficient 0.70) and Relish (0.73), despite the limited degree of similarity between Dif and Relish binding specificities (0.29). The observed differences in DNA-binding preferences of Dif and Relish (summarized as binding sequence logos in Supplemental Fig. S1) are likely to contribute to preferential activation of specific antimicrobial peptide genes by one TF or another, consistent with the results of previously published SELEX-based analysis (Senger et al. 2004).
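The GGRDNNHHBS consensus is written in IUPAC degenerate codes, so the size of the binding-site space it defines can be checked by direct enumeration. The following is a minimal sketch using only the standard IUPAC code table; the function name `expand_consensus` is illustrative and is not taken from the authors' software.

```python
from itertools import product

# Standard IUPAC nucleotide codes that occur in the Rel consensus.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "S": "CG", "D": "AGT", "H": "ACT", "B": "CGT", "N": "ACGT",
}

def expand_consensus(consensus):
    """Return every concrete DNA sequence matching an IUPAC consensus."""
    choices = [IUPAC[code] for code in consensus]
    return ["".join(bases) for bases in product(*choices)]

core = expand_consensus("GGRDNNHHBS")       # consensus spotted on the arrays
extended = expand_consensus("GGRDNNHHBN")   # consensus with the last position relaxed
print(len(core), len(extended))
```

Enumerating the space this way gives 5184 sequences for GGRDNNHHBS and 10,368 for GGRDNNHHBN, matching the counts quoted in the text.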
To extrapolate the binding affinity predictions to all 5184 variants of the GGRDNNHHBS consensus we employed the PC statistical analysis, which considers variant DNA sequences as points in a high-dimensional Euclidean space, with coordinates that reflect the sequence composition. The binding affinity of a TF to different DNA sequences is then modeled as a function of these coordinates. The model incorporates the effects of interactions between base pair positions in the binding site and it is sensitive to subtle differences in binding specificities of homologous TFs (Udalova et al. 2002). The 15 largest PCs were used to explain the variance of the GGRDNNHHBS space, of which 10 had significant coefficients (P-value < 0.05) for Dorsal, seven for Dif, and 11 for Relish (Supplemental Table S1). The 5184 sequence variants were ranked from 0 to 1 based on their predicted binding affinity to corresponding Rel proteins. The PC predictions were sufficiently accurate over the wide range of binding affinities and explained ~75% of total binding variance (see Methods). To incorporate the recent SELEX data for Dorsal, Dif, and Relish (Senger et al. 2004), we generated an additional set of scores for GGRDNNHHBN by averaging the N = [C,G] scores for instances where N = [A,T], giving a set of 10,368 Extended Scored Binding Sites (ESBSs) (Supplemental Table S2).

A high frequency of conserved Rel binding sites in the promoters of immune and developmental genes

To map putative Rel binding sites on a genome-wide basis, we screened the aligned genomes of seven Drosophila species for the presence of binding motifs within the GGRDNNHHBN consensus. A total of 320,701 sites (excluding those that overlapped with protein-coding exons) were found within 2 kb of the start sites of predicted genes (as defined by Ensembl).
A large percentage of these sites (61%, 195,293 sites) did not have a counterpart in other genomes, with the remainder aligning in at least one more species; 3335 sites or 1% of these sites aligned in all seven species. A list of these sites is presented in Supplemental Table S3, which shows that known Dorsal-regulated developmental genes (e.g., snail, twist, zen, etc.) represent <2% of all the genes with Dorsal binding sites conserved in all seven species in their promoters: (12 genes described in Papatsenko and Levine 2005 + 21 novel genes identified in Stathopoulos et al. 2002 + 16 novel genes identified in Biemar et al. 2006)/2639 total number of genes. Other Rel binding sites conserved in all seven species were identified in the promoters of genes involved in innate immunity (e.g., attacin D, cecropin C; De Gregorio et al. 2001, 2002), other cellular events (e.g., actin, zeelin, Ets21C, etc.), or whose functions have not been analyzed (e.g., CG7313, CG10555, CG33308, etc.). However, when we analyzed the location of Rel binding sites, we found that promoters of developmental and immune genes have significantly more sites than the genome average number of conserved Dorsal binding sites (Table 1). This trend was observed for the complete range of species (two to seven species). Our results support previously published studies in which clusters of Dorsal binding sites were used to identify putative target genes involved in the development of Drosophila embryo (Stathopoulos et al. 2002).
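The motif screen described in this section can be illustrated with a small regular-expression scan. This is a sketch under stated assumptions rather than the authors' pipeline: it searches both strands (the text does not say how strands were handled), reports overlapping matches separately, and uses the hypothetical names `scan` and `reverse_complement`.

```python
import re

# IUPAC codes of the GGRDNNHHBN consensus written as regex character classes.
IUPAC_CLASS = {"G": "G", "R": "[AG]", "D": "[AGT]", "N": "[ACGT]",
               "H": "[ACT]", "B": "[CGT]"}
CONSENSUS = "GGRDNNHHBN"

# A lookahead makes finditer report overlapping matches as well.
PATTERN = re.compile("(?=(" + "".join(IUPAC_CLASS[c] for c in CONSENSUS) + "))")

def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def scan(seq):
    """Return sorted (start, strand, site) tuples for every consensus match.

    `start` is always the 0-based position on the forward strand.
    """
    seq = seq.upper()
    hits = [(m.start(), "+", m.group(1)) for m in PATTERN.finditer(seq)]
    rc = reverse_complement(seq)
    for m in PATTERN.finditer(rc):
        start = len(seq) - m.start() - len(CONSENSUS)
        hits.append((start, "-", m.group(1)))
    return sorted(hits)

for hit in scan("TTGGGAATTTCCAA"):
    print(hit)
```

Because overlapping matches are kept as separate hits, a short stretch of sequence can contribute several sites, which is why site counts from a scan like this can exceed the number of distinct genomic locations.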
Strong functional constraints on Rel binding sites are detected at immune and developmental loci

To examine the evolutionary constraints on putative Rel binding sites, we used three measures: (1) the sequence divergence of binding sites; (2) the Dorsal binding ranking (S) of the D. melanogaster site (SD_mel); (3) the range of Dorsal binding ranking among the seven species (Δs = Smax − Smin). Sequence divergence was taken as the number of historical nucleotide substitutions that occur within each evolving binding site, given the alignment and known species phylogeny (i.e., the maximum parsimony score; see Methods for details). The Dorsal binding ranking (SD_mel) was determined from the interpolated binding site data (see Methods), according to Supplemental Table S2. The range of binding ranking (Δs) was defined as the maximum difference in Dorsal binding affinities between the species variants of the site (e.g., 0.952 − 0.931 = 0.021 for the site at −183 nt or 0.978 − 0.918 = 0.060 for the site at −1724 nt of the snail promoter; see Fig. 2C).
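To make the sequence divergence measure concrete, the sketch below counts the minimum number of substitutions needed to explain an aligned binding site on a fixed species tree, using Fitch's small-parsimony algorithm. It stands in for the score the authors obtained from PAML and is not their code. The tree topology is an assumption based on the commonly accepted relationships among the seven species, and an alignment gap would simply be treated as a fifth character state.

```python
# Assumed rooted topology for the seven species (nested 2-tuples).
TREE = ((((("mel", "sim"), "yak"), "ana"), "pse"), ("vir", "moj"))

def fitch_column(node, column):
    """Return (possible_states, substitutions) for one alignment column."""
    if isinstance(node, str):                 # leaf: the observed base
        return {column[node]}, 0
    left_states, left_cost = fitch_column(node[0], column)
    right_states, right_cost = fitch_column(node[1], column)
    shared = left_states & right_states
    if shared:                                # children agree: no new change
        return shared, left_cost + right_cost
    # Children disagree: one substitution is needed on this part of the tree.
    return left_states | right_states, left_cost + right_cost + 1

def parsimony_score(site_alignment, tree=TREE):
    """Sum Fitch substitution counts over all columns of an aligned site."""
    length = len(next(iter(site_alignment.values())))
    total = 0
    for i in range(length):
        column = {species: seq[i] for species, seq in site_alignment.items()}
        total += fitch_column(tree, column)[1]
    return total

# Hypothetical aligned 10-mers, invented for illustration only.
example = {
    "mel": "GGGAATTTCC", "sim": "GGGAATTTCC", "yak": "GGGAATTTCC",
    "ana": "GGGAATTTCC", "pse": "GGGAATTTCC",
    "vir": "GGGAATTTCA", "moj": "GGGAATTTCA",
}
print(parsimony_score(example))
```

In this invented example the two species of the second clade share a single difference in the last position, so one substitution on the branch separating the clades explains the whole alignment.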
We noticed, however, that Rel binding sites in the promoters of known Dorsal-regulated developmental genes (defined as in Papatsenko and Levine 2005 and marked in red in Fig. 2A) [...]

In addition, we found that the affinity of Rel proteins to the binding sites at both developmental and immune loci increased with site sequence conservation (Fig. 2B).

Static and dynamic components contribute to functional conservation

For ~90% of sites the value of site sequence divergence did not exceed three independent substitutions since the common ancestor of the drosophilid species. Most of the binding affinity conservation was, thus, due to the conserved underlying sequence of the Rel binding site (static conservation). However, although even a single mutation can significantly alter binding affinity, there were four instances in which multiple sequence mutations did not substantially affect predicted binding to Rel proteins. We refer to this latter phenomenon as dynamic conservation and noted that it was mainly observed in the vicinity of genes involved in developmental processes. For example, the dynamically conserved Rel binding site at −1724 nt upstream of the snail gene is presented in Figure 2C.

In summary, a quantitative profiling of Rel protein–DNA interactions led to the detection of atypical examples of binding sites in which the sequence of the site changes significantly, while the overall functional fitness is maintained.

Systematic quantitative analysis of DNA binding affinities identifies new putative functional Rel binding sites of low and moderate affinity

In order to assess the consistency of our results with prior analyses, we examined the enhancer regions of developmental genes described by Papatsenko and Levine (2005) using our scoring scheme and identified 370 putative Rel binding sites (PC sites), which included 136 sites that overlapped with the D. melanogaster sites identified by Papatsenko and Levine (2005) using a standard position-weight-matrix (PWM sites).
The PWM sites consistently have a ranking score >0.9 in our classification (Supplemental Table S4), irrespective of the number of species in which the binding site is conserved. In contrast, our PC sites show an upward trend in ranking scores with increasing sequence conservation, reaching a plateau with five conserved species (Fig. 3).
These results suggest that the scoring of PWM sites by Papatsenko and Levine (2005) was set at a high threshold relative to our analysis, effectively excluding all putative Rel binding sites of low and moderate affinity to Dorsal. At the same time, some of our high-scoring Dorsal binding sequences were not scored by Papatsenko and Levine (2005), but nonetheless may be functional (as their sequences, on average, are more conserved than the genomic background; see PC–PWM data points in Fig. 3).

Relish- and Dif-regulated immune genes may be under distinct functional constraints

Overall, high-affinity Rel binding sites conserved in all seven species were considerably rarer in the vicinity of immune genes compared to developmental genes (Table 1; Fig. 2B).
Discussion

Previous studies demonstrated that many Rel binding sites in the promoters of Dorsal-regulated developmental genes are within evolutionarily conserved sequence blocks (Papatsenko and Levine 2005). Here we investigate parameters of Rel binding site conservation across the entire Drosophila genome and show that sites at immune and developmental loci are under strong functional constraints. Specifically, binding affinities of Rel proteins to these sites are maintained in some cases by the conservation of the DNA binding site sequence, and in others despite significantly diverged sequences (Fig. 2).

Scored Site Conservation (SSC), measured by the number of species that share a homologous scorable binding site at a particular aligned location, shows a strong correlation with binding site strength for promoters of developmental and immune genes, a phenomenon that, to the best of our knowledge, has not been previously reported (Fig. 2B).

The static conservation of Rel binding sites, i.e., conservation of nucleotide sequence, is consistent with the idea that functionally important elements have a slower rate of base substitutions (Jukes and Kimura 1984). We analyzed genomes of seven diverged Drosophila species to maximize the discovery of Rel binding motifs in regions of the genome with varying evolutionary rates (Nobrega and Pennacchio 2004). Less than 40% of the putative Rel binding motifs mapped within 2 kb of predicted gene start sites align in at least two species, with only 1% of the sites aligning in all seven species. We found that the conserved Rel binding sites are more likely to be situated in the vicinity of developmental and immune genes. Of interest, when we analyzed Rel binding sites in the FlyReg database, the percentage of sites aligned in seven species increased 15-fold (six out of 42 putative Rel binding sites scored by us were conserved either statically or dynamically).
This was consistent with the vast majority of the FlyReg sites being located in the vicinity of developmental genes and with our genome-wide observation. In addition, it further highlighted the relationship between the functional properties of a binding site and its conservation in evolution. The presence of multiple, high-affinity well-conserved sites may aid in identification of putative targets of Dorsal (Stathopoulos et al. 2002). For instance, the wnt8 gene has five putative Rel binding sites conserved in seven species, with a binding ranking above a 0.8 cutoff (Supplemental Fig. S3). The wnt8 protein binds to a family of frizzled seven-transmembrane receptors and acts through a cascade of genes on the nucleus. WNT8 (WNTD) has recently been described as a feedback inhibitor of Dorsal in development and immunity, but the molecular mechanisms involved in the activation of this gene by Dorsal are not understood (Ganguly et al. 2005; Gordon et al. 2005). Taken together with the cluster of Rel binding sites, this suggests wnt8 could be a direct target of Dorsal. Another interesting candidate gene is schnurri, with four Rel binding sites conserved in seven species. The expression of schnurri is restricted to the dorsal ectoderm and the ventral mesoderm. Schnurri is believed to act as a repressor of the ind gene and possibly other pan-neural genes, which are not active in the ventral mesoderm and dorsal ectoderm (Stathopoulos and Levine 2005).

Dynamic conservation, where the DNA sequence of the binding motif mutates without significant effect on its binding affinity, is likely to relate to the “fuzziness” of the binding sites due to the permissiveness of transcription factor–DNA interactions. Although the plasticity of DNA binding sites is often emphasized, and may have entropic or selective advantages in evolution (Gerland and Hwa 2002), we detected only a few examples of dynamic conservation of Rel binding sites (Fig. 2C).

The Rel proteins evolve with different mutation rates.
Relish and Dif have a four- and 10-fold higher number of diverging amino acids than Dorsal, respectively (Fig. 1).

The studies reported here further our understanding of the genome organization and functional conservation of transcription factor binding sites. They also highlight the importance of quantitative approaches to analyzing the genome regulatory code, as they provide a more sensitive tool for the annotation of putative binding sites and discerning structure-function relationships, i.e., the correlation between the binding affinity and DNA sequence conservation.

Methods

Identifying orthologs of Dorsal, Dif, and Relish and phylogenetic analysis

We used the Drosophila melanogaster protein sequences of Dorsal, Dif, and Relish to search the genomes of other drosophilids, downloaded from the UCSC genome Web site (http://genome.ucsc.edu) using TBLASTN. We then isolated top matching regions and used GeneWise (Birney et al. 2004) to obtain protein predictions for each species. Other RHD-containing sequences were identified via protein BLAST searches of the NCBI database (Altschul et al. 1997). We confirmed orthology relationships via phylogenetic analysis of the aligned Rel Homology Domains (defined as the region covered by the PFAM RHD [Finn et al. 2006] and SMART IPT [Letunic et al. 2006]) using PHYML (Guindon et al. 2005).

Protein modeling

Multiple sequence alignments as well as homology models of Dif, Relish, and Dorsal were built in Internal Coordinate Mechanics (ICM) using homology modeling based on the available structure of mammalian c-Rel (PDB 1gji) as a template. Briefly, the sequence-structure alignment was done using the ICM alignSS algorithm that optimizes the sequence-structure match using residue accessibilities, secondary structures, and functional sites of the template and sequence. Loop predictions and model refinement were done using local global energy optimization strategy (http://www.molsoft.com).
Protein–DNA binding assay and PC model

RHD domains of Rel proteins of D. melanogaster origin (Dorsal: aa 16–384; Dif: aa 17–378; Relish: aa 117–434) were cloned into pET21d vector (Novagen) and their sequences were verified by DNA sequencing. The proteins were expressed in BL21(DE3) bacterial cells. The proteins were purified by DNA-binding affinity chromatography using biotin-labeled oligoduplexes comprising either NFKB site GGGGGATTCC or GGGAATTTCC essentially as described in Udalova et al. (1998). Oligonucleotide duplexes corresponding to 182 variants of the minimal spanning subset of motifs that uniformly covers the GGRDNNHHBS consensus binding space were spotted in quadruplicate onto Codelink slides as previously described (Linnell et al. 2004). Three independent binding experiments were performed for each Rel protein. The protein–DNA binding was detected with anti-HIS antibodies (H-15, sc-308, Santa Cruz Biotechnology, Inc.) followed by the secondary Cy5-conjugated anti-rabbit IgG antibodies (Jackson Immunoresearch Laboratories). Slides were scanned using an Axon 4000B scanner (Axon Instruments, Inc.) and analyzed with GenePix 4.1. Protein binding signal was normalized against DNA concentration in the corresponding spot, using Sybr Green (Molecular Probes), according to the manufacturer's instructions. The average of four binding values per slide was ascribed to each sequence variant. Binding signals were normalized within each slide relative to the fluorescent reading for the GGGGTTCCCC motif, which was given a value of 1000. Affinity-weighted sequence logos were generated by producing a sequence list containing as many copies of each representative sequence as the normalized experimental binding value for that sequence scaled by 1/100. These sequence lists were then used as input for the WebLogo program (version 2.8.2) (Crooks et al. 2004) (available at http://weblogo.berkeley.edu/logo.cgi).
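The within-slide normalization and the construction of the affinity-weighted sequence list can be sketched as follows. This is an illustration, not the authors' script: the function names are hypothetical, the reference 10-mer is written as GGGGTTCCCC, rounding to a whole number of copies is an assumption (the text only says the value is scaled by 1/100), and the input signals are invented.

```python
def normalize_to_reference(signals, reference="GGGGTTCCCC", reference_value=1000.0):
    """Rescale one slide so that the reference motif reads `reference_value`."""
    factor = reference_value / signals[reference]
    return {seq: value * factor for seq, value in signals.items()}

def weighted_sequence_list(normalized, divisor=100):
    """Repeat each sequence in proportion to its normalized binding value.

    Assumption: the copy number is the scaled value rounded to an integer.
    """
    sequences = []
    for seq, value in normalized.items():
        copies = round(value / divisor)
        sequences.extend([seq] * copies)
    return sequences

# Invented raw slide averages, for illustration only.
raw = {"GGGGTTCCCC": 50.0, "GGGAATTTCC": 25.0, "GGAATTTCCA": 1.0}
normalized = normalize_to_reference(raw)
weighted = weighted_sequence_list(normalized)
print(normalized)
print(len(weighted))
```

The resulting list, written one sequence per line, is the kind of input a logo generator accepts. Weak binders that round to zero copies drop out of the logo entirely, which is one consequence of the rounding assumption made here.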
Extrapolation of the binding affinity predictions to all DNA motifs was achieved by fitting Principal Coordinate models to experimental data, essentially as described in Udalova et al. (2002). The PC model was fitted to the logarithms of three replicated measurements for Rel binding and included extra terms to account for between-microarrays effects. The 15 largest PCs were used to explain the variance of the GGRDNNHHBS space. In the regression out of 15 PCs, 10 had significant coefficients (P-value < 0.05) for Dorsal, seven for Dif, and 11 for Relish (Supplemental Table S1). The correlation coefficients between the affinities predicted by the PC model and experimentally observed affinities were 0.53 for Dif, 0.53 for Dorsal, and 0.55 for Relish and explained 74%, 74%, and 75% of total binding variance, respectively. Binding affinity predictions were extrapolated to all 5184 variants of the consensus and the sites were ranked. We further generated a set of scores for GGRDNNHHBN by averaging the N = [C,G] scores for instances where N = [A,T], giving a set of 10,368 Extended Scored Binding Sites (ESBSs) (Supplemental Table S2).

Genome analysis

We identified all instances of ESBSs in the D. melanogaster genome. We then mapped these onto alignments of drosophilid genomes (D. melanogaster, D. simulans, D. yakuba, D. ananassae, D. virilis, D. mojavensis, D. pseudoobscura) taken from the UCSC Web server (http://genome.ucsc.edu/multiz9way “maf” files), and assigned each aligned sequence the D. melanogaster score for that particular 10mer. Species in which the 10mer was not present in the ESBS set were counted as nonconserved. The range of binding site strengths was calculated by subtracting the lowest binding ranking from the highest for each scoring binding site.
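The per-site bookkeeping described here (how many species carry a scorable site, and the range of rankings among them) can be sketched as below. The function name, the data layout, and the scores are hypothetical; in practice the score table would be the ESBS rankings of Supplemental Table S2.

```python
def site_conservation(aligned_sites, scores):
    """Return (number of species with a scored site, range of rankings).

    aligned_sites maps species name -> aligned 10-mer, or None if unaligned.
    scores maps 10-mer -> ranking between 0 and 1.
    A species whose 10-mer is absent from `scores` counts as nonconserved.
    """
    rankings = [scores[site] for site in aligned_sites.values()
                if site is not None and site in scores]
    if not rankings:
        return 0, None
    return len(rankings), max(rankings) - min(rankings)

# Invented rankings and sites, for illustration only.
scores = {"GGGAATTTCC": 0.90, "GGGAATTCCC": 0.85}
aligned = {
    "mel": "GGGAATTTCC",
    "sim": "GGGAATTTCC",
    "yak": "GGGAATTCCC",   # diverged sequence, still a scored site
    "vir": "ATGAATTTCC",   # no longer matches the consensus
    "moj": None,           # no aligned sequence
}
n_species, delta_s = site_conservation(aligned, scores)
print(n_species, delta_s)
```

Here three species carry a scored site, and the range of rankings is the difference between the highest and lowest of those three values.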
Sequence variability at a given binding site was calculated using the baseml program from the PAML software package, taking the “maximum parsimony score” from the mlb output file (http://abacus.gene.ucl.ac.uk/software/paml.html). RHD binding sites that were within 2 kb of the start of an Ensembl gene (http://www.ensembl.org) and that did not overlap with protein coding exons were analyzed. Ensembl genes were partitioned into three sets: (1) those used by Papatsenko and colleagues as known targets of Dorsal in dorsoventral patterning (Papatsenko and Levine 2005); (2) those identified by De Gregorio and colleagues as being involved in the immune response (De Gregorio et al. 2001); (3) all other genes not included in either of sets 1 or 2. Overlapping binding sites were treated as distinct, e.g., a 12mer could potentially contain three overlapping Dorsal binding sites. Taking only the highest scoring binding site in overlapping sets does not significantly alter the results. In the analysis of enhancer regions of developmental genes, 83 D. melanogaster PWM sites defined by Papatsenko and Levine (2005) were overlapped by 128 PC sites.

Acknowledgments

We thank Dr. D. Papatsenko (UC Berkeley) for providing us with the sequences of Drosophila enhancers involved in dorsoventral patterning, Dr. W. Valdar and Professor R. Mott (WTCHG) for advice on statistics, and Professor C. Ponting (MRC-FGU Oxford) for helpful comments. We also thank the Berkeley Drosophila Genome Project, Agencourt Bioscience Corporation, Genome Sequencing Center WUSTL School of Medicine, Human Genome Sequencing Center at Baylor College of Medicine, and the Flybase Consortium for providing access to the assemblies of Drosophila genome sequences. I.A.U. is supported by the MRC New Investigator Award and the ARC. R.R.C. and J.R. are supported by the Wellcome Trust.

Footnotes

[Supplemental material is available online at www.genome.org.]

Article is online at http://www.genome.org/cgi/doi/10.1101/gr.6490707