Copyright : © 2007 Holohan et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. CTCF Genomic Binding Sites in Drosophila and the Organisation of the Bithorax Complex 1 Department of Physiology, Development and Neuroscience, University of Cambridge, Cambridge, United Kingdom 2 Institute for Genetics, Justus-Liebig-University Giessen, Giessen, Germany 3 Department of Genetics, University of Cambridge, Cambridge, United Kingdom Greg Gibson, Editor North Carolina State University, United States of America * To whom correspondence should be addressed. E-mail: rw108/at/cam.ac.uk Received March 26, 2007; Accepted May 21, 2007. This article has been cited by other articles in PMC.Abstract Insulator or enhancer-blocking elements are proposed to play an important role in the regulation of transcription by preventing inappropriate enhancer/promoter interaction. The zinc-finger protein CTCF is well studied in vertebrates as an enhancer blocking factor, but Drosophila CTCF has only been characterised recently. To date only one endogenous binding location for CTCF has been identified in the Drosophila genome, the Fab-8 insulator in the Abdominal-B locus in the Bithorax complex (BX-C). We carried out chromatin immunopurification coupled with genomic microarray analysis to identify CTCF binding sites within representative regions of the Drosophila genome, including the 3-Mb Adh region, the BX-C, and the Antennapedia complex. Location of in vivo CTCF binding within these regions enabled us to construct a robust CTCF binding-site consensus sequence. CTCF binding sites identified in the BX-C map precisely to the known insulator elements Mcp, Fab-6, and Fab-8. 
Other CTCF binding sites correlate with boundaries of regulatory domains allowing us to locate three additional presumptive insulator elements; “Fab-2,” “Fab-3,” and “Fab-4.” With the exception of Fab-7, our data indicate that CTCF is directly associated with all known or predicted insulators in the BX-C, suggesting that the functioning of these insulators involves a common CTCF-dependent mechanism. Comparison of the locations of the CTCF sites with characterised Polycomb target sites and histone modification provides support for the domain model of BX-C regulation. Author Summary There is still much to learn about the organisation of regulatory elements that control where, when, and how much individual genes in the genome are transcribed. Several types of regulatory element have been identified; some, such as enhancers, act over large genomic distances. This creates a problem: how do such long-range elements only regulate their appropriate target genes? Insulator elements have been proposed to act as barriers within the genome, confining the effects of long-range regulatory elements. Here we have mapped the locations of one insulator-binding protein, CTCF, in several regions of the Drosophila genome. In particular, we have focussed on the Hox gene cluster in the Bithorax complex; a region whose regulation has been extensively characterised. Previous investigations have identified independent regulatory domains that control the expression of Bithorax complex genes in different segments of the fly, however the molecular nature of the domain boundaries is unclear. Our major result is that we find CTCF binding sites precisely located at the boundaries of these regulatory domains, giving a common molecular basis for these boundaries. This provides a clear example of the link between the positioning of insulators and the organisation of gene regulation in the Drosophila genome. 
Introduction Insulator elements are DNA sequences that regulate interactions between promoters and enhancers. By preventing inappropriate enhancer/promoter communication, insulators are believed to play a key role in the genomic organisation of transcriptional regulation. Their mode of action is still unclear but may involve the formation of chromatin loops that partition the genome into separate regulatory domains [1–5]. In vertebrates, almost all characterised insulator elements are associated with the binding of CTCF, a DNA-binding protein that contains multiple zinc fingers. Although CTCF was initially identified as both a transcriptional activator and repressor [6–8], it was subsequently recognised as being essential for the enhancer blocking activity of several vertebrate insulators [9]. CTCF also functions in imprinting [10,11] and has been implicated in human disease [12]. Recently, Drosophila CTCF has been identified [13], joining other known Drosophila enhancer blocking proteins such as Su(Hw) [14], Zw5, and BEAF32 [15,16]. In addition to insulation of entire genes or groups of genes, insulators may also flank individual enhancers allowing them to act independently, facilitating complex tissue and cell-specific patterns of gene expression [17]. This function is particularly relevant in the case of the Hox genes, whose complex expression patterns specify segmental identities along the body axis. In Drosophila, correct antero-posterior patterning in the thorax and abdomen is dependent on the precise expression of the Hox genes of the Bithorax complex (BX-C) in specific parasegments [18,19]. This is achieved by the subdivision of the regulatory regions of each of the three BX-C genes (Ultrabithorax [Ubx], abdominal-A [abd-A], and Abdominal-B [Abd-B]) into distinct enhancer domains [20]. 
There are at least nine distinct regulatory regions, each important for specifying homeotic gene expression in individual thoracic and abdominal parasegments (PS) from PS 5–13 [21–25]. The domain hypothesis of Mihaly et al. [26] proposes that each distinct regulatory region or domain contains a modular arrangement of functional elements required for Hox gene expression in a particular parasegment. These elements include initiator, enhancer, and memory elements/Polycomb-response elements (PREs). It is thought that boundary elements, located between adjacent domains, restrict the influence of each regulatory region. The evidence for this comes from mutations that disrupt boundary function and from enhancer trap transposon studies, which have generated a map of the BX-C compartmentalised into distinct parasegmental regulatory regions [27,28]. Three boundaries, Mcp, Fab-7, and Fab-8, have been defined by mutation [29–33]. Another, Fab-6, has been mapped genetically [26], and others are postulated to exist. Each of the three BX-C boundaries identified by mutational analysis displays insulator function; i.e., it is capable of suppressing reporter gene expression when placed between an enhancer and a promoter in a transgenic insulator assay [4,29,34–36]. Recently, Moon et al. [13] showed that the Fab-8 boundary element contains binding sites for CTCF and that mutation of these sites greatly reduces the ability of Fab-8 to suppress reporter gene expression in an insulator assay, demonstrating that the insulating activity of Fab-8 is dependent on CTCF. Here we use chromatin immunopurification together with genomic microarray analysis (ChIP-array) to investigate in vivo CTCF binding in several regions of the Drosophila genome, including the BX-C. From this analysis, we identify a CTCF binding-site consensus that allows the precise location of CTCF binding sites in these genomic regions. 
In the BX-C, in addition to the characterised CTCF sites in the Fab-8 boundary element, we demonstrate the presence of CTCF binding sites in the Mcp and Fab-6 boundaries. Furthermore, we identified CTCF binding sites between the regulatory regions bxd/pbx and iab-2, between iab-2 and iab-3, and between iab-3 and iab-4, providing both a localisation of the previously postulated boundary regions of “Fab-2,” “Fab-3,” and “Fab-4” and a demonstration that these too bind CTCF. A number of CTCF binding sites have been identified in the vertebrate genome, but there is little agreement as to the similarity between these sites at the DNA level [7,13,37]. Binding data have been interpreted to suggest that different combinations of zinc fingers are used to bind to differing sites of approximately 50 bp [7,38]. In contrast, our analysis in Drosophila indicates that CTCF sites contain a conserved consensus binding sequence of approximately 20 bp in length. Examination of the vertebrate CTCF binding sites reveals that they too contain a consensus sequence and that this vertebrate CTCF consensus is similar to the Drosophila site identified here. Results Identification of In Vivo CTCF Binding Locations In order to identify the in vivo binding sites of the Drosophila CTCF protein, we used our previously described ChIP-array procedure [39]. Sonicated chromatin, isolated from Drosophila embryos, was immunopurified using either anti-CTCF antiserum (specific immunopurification [IP]) or normal rabbit serum (control IP). The immunopurified DNA preparations were labelled with either Cy3 or Cy5 and hybridised to a 1-kb tiling-path genomic microarray covering the 3-Mb Adh region together with other selected genomic regions including the BX-C, the Antennapedia complex (ANT-C), and the achaete-scute region. 
As a positive control, the immunopurification reactions were assessed using specific PCR primers to amplify a 378-bp fragment from the Fab-8 region, containing characterised CTCF binding sites [13]. This fragment showed clear enrichment when compared with amplification using primers for a 300-bp fragment (Clone 10) that does not contain a CTCF binding site (unpublished data). Replicated hybridisation to genomic DNA tiling arrays generated a dataset (Dataset S1) with mean enrichments (Mn) equivalent to log2 3.8 (14-fold) observed. The Fab-8 positive control is represented on the array as fragment UBX65, which gave an enrichment value of 1.56 (3-fold) and good reproducibility (p = 0.0045 across four biological replicates). Fragments showing Mn > 0.45 (1.4-fold) and p < 0.05 were selected as potential CTCF binding sites (Figure 1
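The selection rule above (Mn > 0.45 and p < 0.05, with Mn read as a log2-like enrichment) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the `Fragment` class and the two "hypothetical" entries are assumptions introduced for illustration, and only the UBX65 values are taken from the text.

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    fragment_id: str
    mn: float        # mean enrichment on a log2-like scale
    p_value: float   # replicate-based p-value (CyberT in the paper)


def fold_change(mn: float) -> float:
    """Convert a log2-scale enrichment into a linear fold enrichment."""
    return 2.0 ** mn


def select_candidates(fragments, mn_cutoff=0.45, p_cutoff=0.05):
    """Keep fragments that pass both the enrichment and the p-value cutoff."""
    return [f for f in fragments if f.mn > mn_cutoff and f.p_value < p_cutoff]


fragments = [
    Fragment("UBX65", 1.56, 0.0045),             # values quoted in the text
    Fragment("hypothetical_weak", 0.30, 0.010),  # illustrative only
    Fragment("hypothetical_noisy", 0.90, 0.200), # illustrative only
]
selected = select_candidates(fragments)

for f in selected:
    print(f.fragment_id, round(fold_change(f.mn), 2))
```

Under this reading, the cutoff of 0.45 corresponds to roughly 1.4-fold, 1.56 to roughly 3-fold, and 3.8 to roughly 14-fold, matching the fold values quoted in the text.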
Identification of a CTCF Binding Consensus To identify potential CTCF binding sites within enriched fragments, the 33 candidate fragments were submitted to a motif discovery tool, Multiple Em for Motif Elicitation (MEME), to search for overrepresented sequence motifs [40]. The top motif found by MEME (e = 1.3 × 10−20) (Figure 1 Correlation between CTCF Binding Consensus and CTCF Binding The 30 occurrences of the MEME motif were used to construct a position-specific weight matrix that was in turn used as the input for the Patser profile-matching tool to search for matches within the genomic sequences on the microarray. The association between Patser matches and CTCF binding is demonstrated in Figure 1 Another way to examine the functional relevance of these predicted sites is to look at their conservation across species. Figure 1 Taken together, these data support the idea that the binding sites for CTCF in Drosophila can be described by a single weight matrix approximately 20 bp in length. This is clearly at odds with the notion, derived from studies of CTCF DNA binding in vertebrates, that CTCF binds to 50-bp target sites with a diverse spectrum of sequences [7,38]. CTCF Sites in the Bithorax Complex The ChIP-array analysis identifies eight locations with CTCF binding within the BX-C. As shown in Figure 2
We performed ChIP experiments with crosslinked chromatin from both Drosophila S2 cells and embryos to validate this set of CTCF sites (Figure 3
Since the nine CTCF binding regions show both ChIP enrichment and high-scoring Patser matches, they are likely to be direct CTCF targets rather than products of indirect association through, for example, chromatin looping. To substantiate this we analysed DNA binding in vitro by electrophoretic mobility shift assay (EMSA) with radioactively labelled probes and bacterially expressed purified GST-CTCF fusion proteins (Figure 3 As shown in Figure 2 The remaining mapped boundary, Fab-7, shows neither significant CTCF binding in the ChIP-array analysis nor a Patser site p < 10−13. However, using the more sensitive PCR assay of ChIP enrichment, we do observe a relatively weak but significant association of CTCF with Fab-7 (Figure S1). Given the strong connection of CTCF sites to mapped boundary elements, we investigated whether the other CTCF sites within the BX-C also identified boundaries. The positions of boundary elements can be estimated from the mapping of mutations that affect the individual parasegment-specific regulatory elements, and the extents of these cis-regulatory domains (taken from Maeda and Karch [28]) are indicated by the coloured bar in Figure 4
Of the remaining three CTCF binding sites with the BX-C (sites A–C, Figure 4 In summary, the CTCF sites identified here correlate with six out of seven known or postulated boundary elements, the only exception being Fab-7. As CTCF has been demonstrated to be required for insulator function at Fab-8 [13], it is likely that all these CTCF-associated boundaries function through a common CTCF-dependent mechanism. Genomic Context of CTCF Sites in the Bithorax Complex According to the domain model of BX-C regulation, the domains bounded by insulators would act as autonomous units that could either be active or silenced depending on the state of memory elements/PREs within each domain [26]. This is likely to require a precise arrangement of insulators and PREs to restrict PRE-dependent chromatin modification to specific domains. Several PREs have been mapped within the BX-C and, in particular, PREs have been located close to the boundary elements Mcp, Fab-7, and Fab-8 [29,33,41]. We were interested in examining the relationships between CTCF sites, the binding sites for Polycomb complexes, and the domains of chromatin modification. For this analysis we compared our CTCF ChIP-array data with a genome-wide analysis of Polycomb targets in Drosophila that determined the genomic binding sites for Polycomb Repressive Complex 1 (PRC1) complex components (Pc and Psc), for the Polycomb Repressive Complex 2 (PRC2) complex component E(Z), and for the PRC2-dependent chromatin modification, trimethylation of histone H3 lysine 27 (H3K27me3), in S2 cells [42]. In this particular cell line the Abd-B gene is expressed; the four downstream Abd-B promoters are active, but the most upstream promoter (Abd-B-RE) is silenced. Schwartz et al. [42] found that the Fab-7 and Fab-8 PREs are not bound by Polycomb complexes, and the Abd-B transcription unit is largely within an “open” domain devoid of H3K27me3 histone modification. In Figure 5
The H3K27me3 profile also shows a relationship to the location of CTCF sites. The most prominent feature of the H3K27me3 profile in S2 cells is the domain between approximately 12,725,000 and 12,795,000, which lacks the repressive trimethylation of lysine 27 (K27me3) modification. The right-hand side of this domain has a sharp border that corresponds well with the CTCF site “C” at 12,795,406. The left-hand side of the domain does not have a clear border and does not fit with a CTCF site. It is tempting to speculate that the differences in the two borders of the H3K27me3 domain may be related to the relative arrangement of the CTCF and Polycomb sites. On the left-hand side, the Polycomb site is “outside” the CTCF site, and the H3K27me3 modification spreads rightwards from the Polycomb site. On the right-hand side, the Polycomb site is “inside” the CTCF site, and the H3K27me3 modification does not spread past the CTCF site. We also note that the positions of the CTCF site/PREs at “Fab-4,” Mcp, and Fab-8 are associated with pronounced depressions in the K27me3 profile. This may be related to nucleosome depletion at PREs [43], but it is interesting that CTCF binding sites in the mouse β-globin locus are also depleted for repressive chromatin marks [44]. We examined the conservation of the CTCF sites in the BX-C. The sites show high conservation with median PhastCons scores close to 1.0 across the approximately 20-bp motif. We illustrate this for an individual site Mcp (Figure 5 CTCF Binding in Other Genomic Regions Other genomic regions screened for CTCF binding sites on the microarray include the 3-Mb Adh region and the smaller Antennapedia and achaete-scute regions. The Adh region [45] is a well-characterised region of Chromosome 2L, containing approximately 250 genes from kuzbanian to cactus, which serves as a representative region of the fly genome. The ChIP-array identified 18 fragments in the Adh region with Mn > 0.45 (1.4-fold) and p < 0.05 (Figure 6
Identification of CTCF binding sites within the Adh region presented an opportunity to investigate the relationship between CTCF binding sites and annotated genome features. Given CTCF's well-documented insulating function, it seemed likely that most identified sites would be in intergenic regions and this proved to be the case. Of the 17 sites in the ADH region, 15 (88%) are present in intergenic regions. No sites overlap exons, but two sites present in ADH-705 overlap the 3′ UTR region of the protein kinase gene smell-impaired 35A (smi35A). Most sites occur as single isolated sites (65% are separated by at least 500 bp), but there are three pairs of sites that are closer than 200 bp apart. Thus, in general, the CTCF sites in this region are not present in multisite clusters, but there are some closely spaced pairs of sites. Neighbouring sites, in general, flank several transcription units (e.g., the sites flanking CyclinE [CycE] shown in Figure 6 We compared the location of CTCF sites with the sites we have identified for another Drosophila insulator-binding protein, Su(Hw) (B. Adryan, G. Woerfel, I. Birch-Machin, S. Gao, M. Quick, L. Meadows, S. Russell, and R. White; unpublished data). The Su(Hw) sites are illustrated in Figure 6 A total of three out of the four ChIP-array enriched fragments in the Antennapedia region displayed a match to the top motif discovered by MEME. The remaining fragment is a neighbouring fragment. All three directly enriched fragments contain at least one high-scoring Patser site (p < 10−12). In total, four sites are identified in the Antennapedia genomic region, and only one of the sites occurs in an intergenic region (ANT297). The remaining three sites are located within the first intron of Antennapedia itself. These sites consist of a pair of sites, 179 bp apart, and one “single” site. 
Only a single fragment was identified within the achaete-scute complex; it contains a high-scoring Patser site (p = 10−14.5) and lies in the intergenic region between scute (sc) and lethal of scute (l(1)sc). Vertebrate CTCF Binding Site Although the existence of a region of similarity within different vertebrate CTCF binding sites has been noted [9,49], a consensus binding site has not been universally recognised, mainly because of experiments suggesting that CTCF binds to DNA by employing varying combinations of different zinc fingers [7,50,51]. Following identification of the Drosophila CTCF consensus binding site, we examined whether the vertebrate and Drosophila binding sites are similar in sequence. We took the selection of sites compiled by Moon et al. [13] and submitted these sequences to the motif discovery tool MEME. The highest-scoring motif identified (e = 6.6 × 10−11) was found in all 12 sequences and is similar both to the conserved region identified previously in footprinting experiments and to the Drosophila CTCF binding site reported here (Figure 7
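The weight-matrix matching used throughout the Results (a site stack turned into a position-specific weight matrix, then scanned along genomic sequence) can be sketched in Python as follows. This is a simplified illustration, not Patser itself: the log-odds scoring with a pseudocount and a uniform background is an assumption, no p-value is computed, and the five-base toy sites are hypothetical rather than the CTCF matrix of Table S2.

```python
import math

BASES = "ACGT"
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def build_log_odds(sites, pseudocount=1.0, background=0.25):
    """Build a log2-odds weight matrix from equal-length aligned sites."""
    width = len(sites[0])
    matrix = []
    for pos in range(width):
        counts = {b: pseudocount for b in BASES}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        matrix.append(
            {b: math.log2((counts[b] / total) / background) for b in BASES}
        )
    return matrix


def score_window(matrix, window):
    """Sum the per-position weights for one window of sequence."""
    return sum(column[base] for column, base in zip(matrix, window))


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def best_match(matrix, sequence):
    """Return (score, strand, start) of the best window on either strand.

    For the "-" strand, start is an offset into the reverse complement.
    """
    width = len(matrix)
    best = None
    for strand, seq in (("+", sequence), ("-", reverse_complement(sequence))):
        for start in range(len(seq) - width + 1):
            score = score_window(matrix, seq[start:start + width])
            if best is None or score > best[0]:
                best = (score, strand, start)
    return best


# Hypothetical aligned sites, for illustration only.
toy_sites = ["GCCAT", "GCCAT", "GCCAA", "GCGAT"]
toy_matrix = build_log_odds(toy_sites)
hit = best_match(toy_matrix, "TTTTGCCATTTT")
print(hit)
```

A real search would additionally convert each score into a p-value, as Patser does, so that a threshold such as the one quoted in the text can be applied.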
Discussion The multiple zinc-finger DNA-binding protein CTCF is known to be required for the enhancer blocking action of vertebrate insulators, and a clear role for CTCF in the regulation of endogenous gene expression has been demonstrated at the imprinted Igf2 locus [9–11]. The mode of action of CTCF is, however, still unclear, although several studies have implicated CTCF in the formation of higher-order chromatin structure. CTCF molecules can interact to form clusters and thereby may mediate the formation of chromatin loop domains [44,52–54]. Partitioning of regulatory elements into independent chromatin loop domains is postulated to play a key role in the interactions between enhancers and promoters. Recently, a CTCF homolog was identified in Drosophila, and it was discovered that CTCF is required for the insulator function of the Fab-8 element in the BX-C [13]. This observation opened up the prospect of utilising the wealth of genetic and molecular characterisation of BX-C transcriptional regulation for the analysis of CTCF function. Here we have used ChIP-array to investigate CTCF binding sites in regions of the Drosophila genome with a particular focus on the BX-C. We find that CTCF is not only associated with the Fab-8 insulator, but also with other mapped boundary elements, Fab-6 and Mcp. In addition, we show that CTCF sites are located at other postulated boundaries within the BX-C: “Fab-2,” “Fab-3,” and “Fab-4.” This provides a precise mapping of regulatory domain boundaries and a specific molecular foundation for the domain model of BX-C regulation. We note that the Fab-7 boundary may differ from the other characterised boundaries in the BX-C, as we do not find a strong Patser match to the CTCF consensus in the functionally mapped Fab-7 boundary element. Although Fab-7 was not demonstrably enriched in the ChIP-array, we found significant CTCF association with Fab-7 in the more sensitive PCR-based ChIP assay. 
Given the lack of a strong Patser match this may suggest an indirect association. We also do not see a CTCF site between the abx/bx and the bxd/pbx regulatory elements. However, these elements are separated by a long distance, and it is not clear whether they require insulation. According to the domain model [26], the parasegment-specific regulatory domains that control the expression patterns of the Ubx, abd-A, and Abd-B genes of the BX-C are initially activated in appropriate parasegments by the early pattern-forming genes acting on initiator elements. Each regulatory domain is predicted to contain a particular initiator element, tuned to respond to a specific combination of gap and pair-rule gene products, thus activating the regulatory domain in the appropriate set of parasegments. This activation would be read by maintenance elements consisting of PREs that thereafter autonomously maintain each regulatory domain in either the OFF (silenced) or ON (active) state. Within a domain in the ON state, enhancers present in that domain would be able to engage with the relevant gene promoter and regulate expression of the gene. Boundary elements that flank each domain are proposed to restrict the effects of the initiator and maintenance elements to a single domain. Although boundary elements are postulated to have the common property of insulating the regulatory domains, no sequence similarity between the mapped boundary elements has been reported until now. Here we show that a set of these boundary elements contain CTCF binding sites and bind CTCF in vivo. CTCF has been shown to be required for the insulator activity of Fab-8, and it seems likely that CTCF will also be a required component at the other boundary elements. In support of this suggestion, we find that the CTCF sites are well conserved within the sequenced insect genomes. 
The observation that CTCF sites flank a set of regulatory domains in the BX-C, together with the vertebrate studies that suggest that CTCF can mediate the formation of chromatin loops [44,52] supports the idea that interaction between CTCF sites may organise these domains into chromatin loops. However, how such a looping mechanism enables the autonomy of the individual regulatory domains and facilitates appropriate enhancer/promoter interactions is still unclear. A key feature of the domain model is the relationship between the boundary and maintenance elements. For the domains to be capable of independently being set to the ON or OFF state, the range of influence of PREs needs to be restricted by the domain boundaries. Each domain would require at least one PRE. Our precise mapping of in vivo CTCF binding sites has enabled us to examine their relationship with Polycomb target sites. In strong support of the domain model, we find that the domains demarcated by CTCF sites contain Polycomb target sites. Indeed, we find an intimate relationship between CTCF and Polycomb binding sites as shown in Figure 5 The individual regulatory domains must not only be able to act autonomously to set and maintain their activity state, but they must also be able to interact appropriately with the relevant gene promoters. Boundaries may play a role in this, and recently Cleard et al. [63] have demonstrated a long-range interaction between Fab-7 and the Abd-B-RB promoter. This interaction was associated with lack of Abd-B expression, but similar interactions, bringing in appropriate enhancers, may also activate expression. 
The ability of CTCF to form clusters may facilitate such interactions, and it is intriguing that there are CTCF sites not only at the boundaries but also close to Abd-B promoters; the CTCF site “B” is 300 bp upstream of the Abd-B-RB promoter (Figure 4 We can compare this ChIP-array analysis of CTCF genomic sites with our ChIP-array analysis of binding sites for another Drosophila insulator-binding protein, Su(Hw) (B. Adryan, G. Woerfel, I. Birch-Machin, S. Gao, M. Quick, L. Meadows, S. Russell, and R. White; unpublished data). CTCF and Su(Hw) are both multi-zinc-finger DNA-binding proteins, and in both cases we have identified relatively long (~20 bp) consensus binding sites. In contrast to most DNA-binding proteins, we find that strength of match to the consensus binding sites is a good predictor of in vivo occupancy. We have also investigated whether our data indicate any collaboration between CTCF and Su(Hw). This seemed an attractive possibility since removing Su(Hw) function in vivo has little effect; su(Hw) null mutant flies are female-sterile but viable. Also, the insulating activity of Fab-8 was significantly reduced when the CTCF sites were mutated but not completely abolished [13]. However, we found no evidence for general colocalisation between CTCF and Su(Hw). A total of 60 Su(Hw) sites were identified in the Adh region, and only one of the fragments covering this region contained both CTCF and Su(Hw) sites. The single CTCF site identified in the achaete-scute complex was also some distance from the two Su(Hw) sites we found. Subsequent ChIP-array analysis in the BX-C led to the identification of only one Su(Hw) site within the entire BX-C region, in a location devoid of CTCF binding sites (B. Adryan, S. Russell, and R. White; unpublished data). Indeed, whilst the BX-C appears relatively enriched in CTCF sites compared to the Adh region, the converse is true for Su(Hw). 
For CTCF there are 4.7 sites/100 kb in the BX-C and 1.7 sites/100 kb in the Adh region (using Patser p < 10−13), whereas for Su(Hw) the BX-C is depleted in sites with only 0.29/100 kb in comparison to 2.7/100 kb in the Adh region (using Patser p < 10−15). Clearly, although CTCF and Su(Hw) both possess insulating ability, their sites of action do not correlate and there is no evidence from our analysis, covering approximately 3% of the Drosophila genome, for cooperative activity. By comparing the sequences of ChIP-enriched fragments we identified a strong Drosophila consensus CTCF binding site. Analysis of vertebrate CTCF target sequences leads us to propose that vertebrate CTCF also binds to a similar consensus sequence. Our findings do not support the current view that CTCF binds to divergent DNA sequences by engaging different subsets of the zinc fingers [38,49,64]. Indeed, the binding site revealed here has been previously noted. Bell et al. [9] identified a CTCF binding site in the chicken β-globin insulator, and sequence comparisons between this site and other known CTCF sites [6–8] identified a conserved 3′ region, the mutation of which completely abolished CTCF binding and enhancer blocking. Filippova et al. [49] extended this comparison to include the Dm1 sites, mouse H19 DMD4 and DMD7 and human MYC A, and again identified a conserved region within the larger approximately 50-bp DNase footprint for each site. It is this conserved region that corresponds to the vertebrate CTCF site found here. Very recently, an analysis of CTCF binding in the human genome has generated a vertebrate CTCF consensus site [65], and a CTCF consensus has also been derived from analysis of conserved regions in the human genome [66]. Both these sites are very similar to the consensus we identify here; in particular they share the strong features of the CC at positions 1 and 2, the AG at positions 6 and 7, and the GGC at positions 10, 11, and 12. 
Overall, these findings indicate that CTCF in both Drosophila and vertebrates binds to a single core consensus sequence. In summary, ChIP-array analysis has enabled us to construct a CTCF binding site consensus. Mapping of genomic binding sites leads us to propose that all known or predicted insulators in the BX-C (with the possible exception of Fab-7) function in a CTCF dependent manner. Materials and Methods Fly strains and antibodies. The wild-type strain used was OregonR. The primary antibody used was rabbit anti-CTCF-C [13]. Chromatin isolation and immunopurification for microarray analysis. Chromatin from embryos aged between 0 to 20 h after egg laying was purified as described previously [39]. The 300-μl immunopurification reaction contained 1.0 μl of rabbit anti-CTCF antibody for the specific IP or 1 μl of normal rabbit antiserum for the control IP. ChIP enrichment was assayed using PCR with specific primers as described previously [39]. The primers used were to Fab-8 (UBX65), catcttccgttcatccgtttc and tgttggtgagcaagcgaaga, and Clone 10, attgggattctgcgattctg and tactgttcctggtgctggtg [13]. Validation ChIP assays for the CTCF sites in the BX-C were performed according to Moon et al. [13]. The validation ChIP primers are listed in Table S1. Microarray analysis. The arrays used consist of 4,213 PCR products most of which are approximately 1 kb in length. The regions covered by the PCR products include the 3-Mb Adh region from kuzbanian to cactus, the BX-C and ANT-C regions, and 130 kb of the achaete-scute complex. Amplification and labelling of DNA from enriched chromatin and hybridisations to genomic DNA tiling arrays were carried out as described previously [39]. We used four biological replicates (i.e., independent chromatin preparations), and each of these was hybridised as dye-swap technical replicates giving 16 array hybridisations in total. Microarray scanning, spot-finding, and normalisation were performed as described in Birch-Machin et al. 
[39] and on the FlyChip Web site (http://www.flychip.org.uk). The normalisation used VSN [67], which is based on an arsinh transformation and generates an enrichment measure that is generally equivalent to the log2 Cy3/Cy5 ratio. Statistical significance was assessed using the CyberT framework (http://visitor.ics.uci.edu/genex/cybert) [68]. Binding site analysis. The MEME version 3.0 Web site [40] was used to identify a consensus sequence. Parameters were set to discover up to six motifs between ten to 30 nucleotides in length. The consensus sequence for the CTCF binding motif was depicted using the MEME site stack in WebLogo (http://weblogo.berkeley.edu). The site stack for the CTCF binding motif was used to create a position-specific weight matrix (Table S2) for the Patser Web interface (http://rsat.ulb.ac.be/rsat/patser_form.cgi) [69]. This position-specific weight matrix was used to search DNA sequences present on the array and the Drosophila genome for matches to the consensus sequence using Release 4.0 coordinates. Patser generates a score for each position and provides a p-value; this is the probability of observing a particular score or higher at a particular sequence position. The Affymetrix Integrated Genome Browser (http://www.affymetrix.com/support/developer/tools/affytools.affx) was used to visualise the in vivo CTCF binding profile across the genome. Analysis of the evolutionary conservation of CTCF motifs used the PhastCons multiple alignment data available from the University of California Santa Cruz (Santa Cruz, California, United States) Genome Browser Web site using phastCons15way on D. melanogaster genome Release 4 (http://genome.ucsc.edu). EMSA. Radiolabelled DNA probes (150–250 bp) were generated by PCR with 32P-labelled oligonucleotide primers and prepared by subsequent gel purification. The probes were incubated with 0.2 μg of purified GST, GST-CTCF, or GST-CTCF ZF. Recombinant proteins were prepared as described previously [70]. 
The binding reaction was performed in PBS (pH 7.4, supplemented with 5 mM MgCl2, 1 mM ZnCl2, 1 mM DTT, 0.1% NP-40, and 10% glycerol) for 15 min at room temperature in the presence of 200 ng/μl poly(dI-dC). Protein-DNA complexes were analysed on nondenaturing polyacrylamide gels (3.5% acrylamide [w/v]) in TAE buffer. Electrophoresis was performed at 4 °C with a field strength of 12 V/cm for 3 h.

Dataset S1: CTCF ChIP-Array Data
The table shows the Array Spot ID, chromosome coordinates, Fragment ID, the values for the four biological replicate ratios, number of observations, Mn, standard deviation, t-value, and p-value derived by CyberT from the ChIP-array data. (483 KB TDS)

Dataset S2: The 33 Candidate Enriched Fragments
The table shows Fragment ID, start coordinate, stop coordinate, CyberT Mn, t-value, and p-value for the selected fragments with Mn > 0.45 and p < 0.05. (13 KB PDF)

Figure S1: ChIP Analysis of CTCF Binding at Fab-7
ChIP was performed with chromatin from Drosophila S2 cells as in Figure 3. (54 KB PDF)

Table S1: Primers for Validation ChIP Assays
(29 KB DOC)

Table S2: CTCF Position-Specific Weight Matrix
(12 KB PDF)

Accession Numbers

The Gene Expression Omnibus (GEO) (http://www.ncbi.nlm.nih.gov/geo) accession number for the genomic tiling array is GEO Platform GPL5028 XC003 and for the ChIP data is series GSE7351. The Entrez Gene (http://www.ncbi.nlm.nih.gov) accession numbers of the genes discussed in this paper are: CTCF human, 10664 and Ctcf mouse, 13018.
The FlyBase (http://flybase.bio.indiana.edu) accession numbers of the genes and gene products discussed in this paper are: abdominal-A (abd-A), FBgn0000014; Abdominal-B (Abd-B), FBgn0000015; achaete (ac), FBgn0000022; Alcohol dehydrogenase (Adh), FBgn0000055; Antennapedia (Antp), FBgn0000095; BEAF32, FBgn0015602; cactus (cact), FBgn0000250; CTCF, FBgn0035769; Cyclin E (CycE), FBgn0010382; Enhancer of zeste (E(z)), FBgn0000629; kuzbanian (kuz), FBgn0015954; lethal of scute (l(1)sc), FBgn0002561; outspread (osp), FBgn0003016; Polycomb (Pc), FBgn0003042; Posterior sex combs (Psc), FBgn0005624; scute (sc), FBgn0004170; smell-impaired 35A (smi35A), FBgn0016930; suppressor of Hairy wing (su(Hw)), FBgn0003567; Su(var)3–9, FBgn0003600; Ultrabithorax (Ubx), FBgn0003944; and Zw5, FBgn0000520.

Acknowledgments

We thank the FlyChIP-array facility for excellent support, especially Bettina Fischer for advice on microarray analysis. We also thank Ian Birch-Machin for preparing chromatin samples and Leni Shäfer-Pfeiffer for technical assistance.

Abbreviations
Footnotes

¤a Current address: Smurfit Institute of Genetics and Trinity College Institute of Neuroscience, Trinity College Dublin, Dublin, Ireland

¤b Current address: Theoretical and Computational Biology Group, Medical Research Council Laboratory of Molecular Biology, Cambridge, United Kingdom

Author contributions. RR, SR, and RW conceived and designed the experiments. EEH, MB, and MH performed the experiments. EEH, CK, and RW analysed the data. BA contributed reagents/materials/analysis tools. EEH and RW wrote the paper.

Funding. This work was funded by the United Kingdom Biotechnology and Biological Sciences Research Council and by the Deutsche Forschungsgemeinschaft.

Competing interests. The authors have declared that no competing interests exist.

References