Copyright © 2006, Cold Spring Harbor Laboratory Press

Chromosome Conformation Capture Carbon Copy (5C): A massively parallel solution for mapping interactions between genomic elements

1Program in Gene Function and Expression and Department of Biochemistry and Molecular Pharmacology, University of Massachusetts Medical School, Worcester, Massachusetts 01605-0103, USA; 2NimbleGen Systems Inc., Madison, Wisconsin 53711, USA; 3Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02141-2023, USA; 4Department of Pathology, Brigham and Women's Hospital, Boston, Massachusetts 02115-6110, USA; 5Program for Evolutionary Dynamics, Harvard University, Cambridge, Massachusetts 02138-3758, USA; 6Department of Radiation Oncology, University of Washington School of Medicine, Seattle, Washington 98104, USA

7Corresponding author. E-mail job.dekker/at/umassmed.edu; fax (508) 856-4650.

Received June 2, 2006; Accepted July 25, 2006.

Abstract

Physical interactions between genetic elements located throughout the genome play important roles in gene regulation and can be identified with the Chromosome Conformation Capture (3C) methodology. 3C converts physical chromatin interactions into specific ligation products, which are quantified individually by PCR. Here we present a high-throughput 3C approach, 3C-Carbon Copy (5C), that employs microarrays or quantitative DNA sequencing using 454-technology as detection methods. We applied 5C to analyze a 400-kb region containing the human β-globin locus and a 100-kb conserved gene desert region. We validated 5C by detection of several previously identified looping interactions in the β-globin locus. We also identified a new looping interaction in K562 cells between the β-globin Locus Control Region and the γ–β-globin intergenic region. Interestingly, this region has been implicated in the control of developmental globin gene switching.
5C should be widely applicable for large-scale mapping of cis- and trans- interaction networks of genomic elements and for the study of higher-order chromosome structure. Intense efforts are under way to map genes and regulatory elements throughout the human genome (ENCODE Project Consortium 2004). These studies are expected to identify many different types of elements, including those involved in gene regulation, DNA replication, and genome organization in general. Analysis of only 1% of the human genome has already revealed that genes are surrounded by a surprisingly large number of putative regulatory elements (data available at http://genome.cse.ucsc.edu/encode/). In order to fully annotate the human genome and to understand its regulation, it is important to map all genes and functional elements and also to determine all relationships between them. For instance, all regulatory elements of each gene must be identified. This endeavor is complicated by the fact that the genomic positions of genes and elements do not provide direct information about functional relationships between them. A well-known example is provided by enhancers that can regulate multiple target genes that are located at large genomic distances or even on different chromosomes without affecting genes immediately next to them (Spilianakis et al. 2005; West and Fraser 2005). Recent evidence indicates that regulatory elements can act over large genomic distances by engaging in direct physical interactions with their target genes or with other elements (Dekker 2003; de Laat and Grosveld 2003; Chambeyron and Bickmore 2004; West and Fraser 2005). These observations indicate that the genome may be organized as a complex three-dimensional network that is determined by physical interactions between genes and elements. Therefore, we hypothesize that functional relationships between genes and regulatory elements can be determined by analyzing this network through mapping of chromatin interactions. 
Physical interactions between elements can be detected with the Chromosome Conformation Capture (3C) method (Dekker et al. 2002; Dekker 2003; Splinter et al. 2004; Miele et al. 2006). 3C uses formaldehyde cross-linking to covalently trap interacting chromatin segments throughout the genome. Interacting elements are then digested with a restriction enzyme and ligated intramolecularly (Fig. 1A).
3C was initially used to study the spatial organization of yeast chromosome III (Dekker et al. 2002) and has since been applied to the analysis of several mammalian loci such as the β-globin locus (Tolhuis et al. 2002; Palstra et al. 2003; Vakoc et al. 2005), the T-helper type 2 cytokine locus (Spilianakis and Flavell 2004), the immunoglobulin κ locus (Liu and Garrard 2005), and the Igf2 imprinted locus (Murrell et al. 2004). These studies revealed direct interactions between enhancers and promoters of target genes, with the linking DNA looping outward. 3C was also used to detect trans interactions between yeast chromosomes (Dekker et al. 2002) and between functionally related elements located on different mouse chromosomes (Spilianakis et al. 2005; Ling et al. 2006; Xu et al. 2006). Together, these studies suggest that long-range cis and trans interactions play widespread roles in the regulation of the genome and that 3C is a convenient approach to map this network of interactions. 3C uses PCR to detect individual chromatin interactions, which is particularly suited for relatively small-scale studies focused on the analysis of interactions between a set of candidate elements. However, PCR detection is not conducive to ab initio and large-scale mapping of chromatin interactions. To overcome this problem, 3C libraries need to be analyzed using a high-throughput detection method such as microarrays or DNA sequencing. The extreme complexity of the 3C library and the low relative abundance of each specific ligation product make direct large-scale analysis difficult. Here we present a novel 3C-based methodology for large-scale parallel detection of chromatin interactions. We refer to this method as 3C-Carbon Copy, or “5C.” 5C uses highly multiplexed ligation-mediated amplification (LMA) to first “copy” and then amplify parts of the 3C library followed by detection on microarrays or by quantitative DNA sequencing. 
5C was developed and validated by analyzing the human β-globin locus and a conserved gene desert region located on human chromosome 16. We find that 5C quantitatively detects several known DNA looping interactions. Interestingly, 5C analysis also identified a looping interaction between the β-globin Locus Control Region (LCR) and the γ–δ intergenic region. Previously, several lines of evidence have suggested that this region plays a role in regulating the developmentally controlled switching from γ-globin expression in fetal cells to β-globin expression in adult cells (Calzolari et al. 1999; Gribnau et al. 2000). 5C should be widely applicable to determine the cis and trans connectivity of regulatory elements throughout large genomic regions. In addition, 5C experiments can be designed so that complete interaction maps can be generated for any large genomic region of interest, which can reveal locations of novel gene regulatory elements and may also provide detailed insights into higher-order chromosome folding.

Results

The 3C method has been described in detail (Dekker et al. 2002; Splinter et al. 2004; Vakoc et al. 2005; Dekker 2006; Miele et al. 2006) and is illustrated in Figure 1A. In a typical 3C analysis, individual interaction frequencies are determined by quantifying the formation of predicted “head-to-head” ligation products using semiquantitative PCR (Fig. 1A).

Outline of the 5C technology

We have developed 5C to detect ligation products in 3C libraries by multiplex LMA (Fig. 1A). To analyze chromatin interactions by 5C, a 3C library is first generated using the conventional 3C method. A mixture of 5C primers is then annealed onto the 3C library and ligated. Two types of 5C primers are used: 5C forward and 5C reverse primers. These primers are designed so that forward and reverse primers anneal across ligated junctions of head-to-head ligation products present in the 3C library (Fig. 1A,B).

Forward and reverse 5C primers are designed to contain a unique sequence corresponding to the sense and antisense strand of the 3′-ends of restriction fragments, respectively (Fig. 1B).

Chromatin looping in the human β-globin locus

We developed and optimized the 5C approach by analyzing the human β-globin locus. This locus was selected because several looping interactions have previously been detected by 3C as well as by a second method, RNA-TRAP (Carter et al. 2002; Tolhuis et al. 2002), and therefore detection of these looping interactions using 5C can be used to validate the new method. The human β-globin locus consists of five developmentally regulated β-globin-like genes (ε, HBE; Aγ and Gγ, HBG1 and HBG2; δ, HBD; and β, HBB), one pseudogene (HBpsi), and a Locus Control Region (LCR) located upstream of the gene cluster (Fig. 2A).
We first verified the presence of chromatin loops in the human β-globin locus using the conventional 3C method. The locus was analyzed in the erythroleukemia cell line K562 and in the EBV-transformed lymphoblastoid cell line GM06990. K562 cells express high levels of ε- and γ-globin, whereas GM06990 cells do not express the β-globin locus (Supplemental Fig. 1 The normalized results are presented in Figure 2B Finally, our analysis revealed less frequent random collisions between neighboring restriction fragments around the LCR in K562 cells as compared to GM06990 cells. We have observed similar differences in random collisions around the active and inactive FMR1 promoter (Gheldof et al. 2006) and have proposed that these differences may reflect transcription-dependent modulation of chromatin expansion or changes in subnuclear localization. Based on this analysis, we conclude that the conformation of the human β-globin locus is comparable to the murine locus with looping interactions between the LCR and 3′-HS1 in both expressing and nonexpressing blood-derived cells. The interaction between the LCR and the active Aγ-globin gene is only observed in globin-expressing K562 cells. LMA detection of 3C ligation products We used detection of chromatin loops in the β-globin locus to develop and optimize the 5C technology. LMA was first performed with a single pair of 5C forward and reverse primers to verify that this method can quantitatively detect a ligation product in the context of a 3C library. We designed a 5C primer pair that recognizes a ligation product that is formed by two adjacent restriction fragments located in the gene desert control region. LMA was performed with this primer pair in the presence of increasing amounts of 3C library (generated from K562 cells), and the formation of ligated forward and reverse primers was quantified by PCR amplification with the pair of universal T7 and T3 primers. 
We find that ligation of 5C primers is not observed when nonspecific DNA is present, is dependent on the amount of the 3C library, and requires Taq ligase (Fig. 2C,D LMA detection of looping between the LCR and the Aγ-globin gene We determined whether singleplex LMA can be used to quantitatively detect chromatin looping interactions in the β-globin locus. We focused on interactions involving the LCR and three diagnostic fragments in the β-globin locus (indicated as bars in Fig. 2A Interaction frequencies between HS5 and the three other sites in the β-globin locus were determined by calculating the amount of ligated 5C primers obtained with the 3C library and the amount obtained with the control library. The interaction frequency between the two adjacent restriction fragments located in the gene desert control region was used for normalization. Normalized interaction frequencies are shown in Figure 2E We then tested whether the four interaction frequencies studied here (three in the β-globin locus and one in the control region) can be detected and quantified in a multiplex LMA reaction. We performed LMA with a mix of six 5C primers and used PCR with specific primers to quantify the frequency with which specific pairs of 5C primers were ligated. We then calculated normalized interaction frequencies as described above. We again obtained similar results as with conventional 3C (Fig. 2E Generation of complex 5C libraries using highly multiplexed LMA Comprehensive 5C analysis of chromatin interactions throughout large genomic regions would require high levels of multiplexing in combination with a high-throughput method for analysis of 5C libraries. Therefore, we tested LMA at higher levels of multiplexing and explored two high-throughput detection methods to analyze 5C libraries: microarrays and quantitative DNA sequencing. 
We designed 5C reverse primers for each of the three EcoRI restriction fragments that overlap the LCR and 5C forward primers for 55 restriction fragments throughout a 400-kb region around the LCR. This primer design allows detection of looping interactions between each of the three sections of the LCR and the surrounding chromatin in parallel in a single experiment (see below). We also designed 10 5C forward and 10 5C reverse primers throughout a 100-kb region in the gene desert control region. Forward and reverse primers were designed to recognize alternating restriction fragments. This primer design scheme allows the detection of a matrix of interactions throughout the control region (see below). We performed LMA with a mixture of all 78 5C primers using 3C libraries from K562 and GM06990 and the control library as templates. Each 5C library contained up to 845 different 5C ligation products (the products of 13 reverse primers and 65 forward primers). These products included 165 possible interactions within the β-globin locus, 100 interactions throughout the gene desert, and 580 interactions between the two genomic regions. We verified that the 5C libraries represented quantitative copies of the selected fraction of the 3C libraries. To do this, we again analyzed the same set of four interaction frequencies as in Figure 2E.
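The product counts quoted above follow directly from the primer numbers, since every forward primer can in principle be ligated to every reverse primer. The following is a minimal sketch of that arithmetic, not the authors' code; the region labels are illustrative names. With 55 and 10 forward primers and 3 and 10 reverse primers, the within-region counts are 165 and 100, and the inter-region count works out to 580 (3 × 10 + 10 × 55), for a total of 845. The second call uses a per-region split for the sequencing libraries that is inferred here from the totals reported in the Methods (62 forward × 11 reverse primers; 159, 72, and 451 products); that split is an inference, not a statement from the text.

```python
from itertools import product

def count_pairs(forward, reverse):
    """Count forward x reverse primer pairs, grouped by region combination."""
    counts = {}
    for (f_region, n_f), (r_region, n_r) in product(forward.items(), reverse.items()):
        key = f_region if f_region == r_region else "inter-region"
        counts[key] = counts.get(key, 0) + n_f * n_r
    return counts

# Primer counts for the microarray libraries, as stated in the text.
array_pairs = count_pairs({"globin": 55, "desert": 10}, {"globin": 3, "desert": 10})

# Per-region split for the sequencing libraries: INFERRED from the reported totals.
seq_pairs = count_pairs({"globin": 53, "desert": 9}, {"globin": 3, "desert": 8})

print(array_pairs, sum(array_pairs.values()))
print(seq_pairs, sum(seq_pairs.values()))
```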
Analysis of 5C libraries on microarrays and by quantitative sequencing We tested whether microarray detection and quantitative sequencing can be used for comprehensive analysis of the composition of 5C libraries. First, to facilitate microarray detection, we amplified the 5C libraries described above with Cy3-labeled universal primers. The labeled 5C libraries were then hybridized to a custom-designed microarray that can detect specific 5C ligation products. Since each 5C product is composed of two half-sites, each corresponding to a 5C primer, cross-hybridization of non-specific 5C products can occur to probes that share one half-site. To assess half-site cross-hybridization, the microarray also contained probes that recognize only one of the 78 5C primers present in the library. To determine the optimal length of the microarray probes that allows the least cross-hybridization, each probe was spotted with 18 different lengths of half-sites ranging from 15 to 32 bases (total probe length ranging from 30 to 64 bases). 5C libraries were hybridized to the array, and specific and half-site hybridization was quantified. We found that probes that are composed of two half-sites with a length ranging from 19 to 24 bases displayed the lowest relative level of cross-hybridization of half-sites (see Supplemental Fig. 4
Second, we analyzed the composition of 5C libraries by quantitative sequencing. 5C libraries are composed of linear DNA molecules that all are ~100 bp long, which makes them ideally suited for high-throughput single-molecule pyrosequencing. We generated similar 5C libraries as used for microarray detection, except that five of the 78 5C primers were left out (leaving 62 forward and 11 reverse primers; see Methods). We analyzed 5C and control libraries using the GS20 platform developed by 454 Life Sciences Corp. (Margulies et al. 2005). For each library, we obtained at least 160,000 sequence reads (Supplemental Table 4). For each sequence, we determined which pair of ligated 5C primers it represented, and we counted the number of times each specific 5C ligation product was sequenced (see Methods; Supplemental Table 5). As each ligation product was sequenced many times (the median count for intrachromosomal interactions was 133 for K562, 53 for GM06990, and 134 for the control library), a quantitative determination of interaction frequencies was obtained. Interaction frequencies were calculated by dividing the number of times a 5C product was sequenced in a 5C library by the number of times it was sequenced in the control library (see Supplemental Table 6).

Large-scale 5C analysis of the β-globin locus

We first analyzed the interaction profiles of HS5 throughout the 100-kb β-globin locus as detected on the microarray (Fig. 3A).

Interestingly, 3C and 5C analyses also revealed strong interactions between the LCR and a region located between the γ- and δ-globin genes in K562 cells. This region contains the β-globin pseudogene, which is weakly expressed in K562 cells but is silent in GM06990, and the initiation site for an intergenic transcript (Gribnau et al. 2000).

We compared the 5C and 3C data sets directly by calculating for each pair of interacting fragments the fold difference between their interaction frequencies as determined by 5C and 3C. We find that the difference in 5C data obtained by microarray detection and conventional 3C is generally less than twofold (Fig. 3D).
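The two calculations described in this section (a sequencing-based interaction frequency as a ratio of read counts, and a fold difference between 5C and 3C frequencies) can be sketched as follows. This is an illustration under stated assumptions rather than the authors' pipeline: the read counts, 3C values, and fragment names are hypothetical, and the fold difference is taken symmetrically (larger value over smaller), which the text does not specify.

```python
def interaction_frequency(library_count, control_count):
    """Reads for a ligation product in a 5C library divided by reads in the control library."""
    if control_count <= 0:
        return None  # product not observed in the control library: ratio undefined
    return library_count / control_count

def fold_difference(freq_a, freq_b):
    """Symmetric fold difference (always >= 1) between two positive frequencies (an assumption)."""
    if freq_a is None or freq_b is None or freq_a <= 0 or freq_b <= 0:
        return None
    return max(freq_a, freq_b) / min(freq_a, freq_b)

# Hypothetical read counts: fragment pair -> (reads in 5C library, reads in control library).
read_counts = {("HS5", "frag_A"): (120, 150), ("HS5", "frag_B"): (30, 120)}
# Hypothetical 3C interaction frequencies for the same fragment pairs.
freq_3c = {("HS5", "frag_A"): 0.6, ("HS5", "frag_B"): 0.4}

for pair, (library_reads, control_reads) in read_counts.items():
    freq_5c = interaction_frequency(library_reads, control_reads)
    print(pair, freq_5c, fold_difference(freq_5c, freq_3c[pair]))
```

Dividing by the control library count corrects for differences in how efficiently each primer pair is ligated and detected, because the control library contains all ligation products in known proportions.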
Taken together, these results provide clear proof of principle that 5C in conjunction with microarray detection or quantitative sequencing is a powerful methodology to quantitatively detect chromatin interactions in a high-throughput setting. Interactions between HS5 and upstream elements The mouse LCR interacts with HS elements (HS-62.5/HS-60) located up to 40 kb upstream of the LCR (Tolhuis et al. 2002; Palstra et al. 2003). It is not known whether functionally equivalent interacting elements are present in this region of the human genome. It has been noted that olfactory receptor genes located ~90 kb upstream of the LCR are orthologous to ones located 40 kb upstream of the murine locus (Bulger et al. 2000), indicating that these regions are related. In addition, the murine HS-62.5/ HS-60 element is embedded in a sequence that is similar to a sequence located ~90 kb upstream of the human LCR (Bulger et al. 2003). These observations indicate that the region located ~90 kb upstream of the human LCR is orthologous to the region located ~40 kb upstream of the murine locus, suggesting that this region may also interact with the LCR in human cells. To assess in an unbiased fashion whether the LCR interacts with any upstream elements in the human locus, the 5C experiment described above was designed to include analysis of a large region located upstream of the LCR. We analyzed the interaction profiles obtained by microarray detection and quantitative sequencing of HS5 with a region up to 280 kb upstream of the LCR. Both data sets showed that interactions throughout this region are generally much lower than those observed between the LCR and the β-globin locus. However, in both cell lines, we did detect elevated interaction frequencies throughout a large domain located 50–100 kb upstream of the LCR (Fig. 
4A These results suggest that the region located 50–100 kb upstream of the LCR in the human genome is in relatively close proximity to the LCR and therefore may be functionally equivalent to the genomic region located 40 kb upstream of the LCR in the murine locus. In addition, these results illustrate that large-scale mapping of interactions using 5C can greatly facilitate the discovery of the locations of novel putative regulatory elements. Parallel analysis of multiple interaction profiles A major advantage of 5C is the fact that interactions between multiple elements of interest and other genomic elements can be analyzed in parallel in a single experiment. The 5C experiment described here was designed to illustrate this aspect of the methodology. As described above, 5C forward and reverse primers were designed to allow simultaneous detection of interaction profiles of each of the three subsections of the LCR with the 400 kb of surrounding chromatin. Data obtained by microarray analysis and quantitative sequencing of 5C libraries showed that the interaction profile of the restriction fragment overlapping HS2/3/4 of the LCR fragment is very similar to that of HS5 (Fig. 4C Large-scale 5C analysis of the gene desert control region The 5C analysis of the β-globin locus was focused on the mapping of interactions between a fixed regulatory element, the LCR, and the surrounding chromatin. 5C experiments can also be designed so that a more global data set is obtained, which is particularly useful when the positions of regulatory elements are poorly defined. Here we provide an example of an alternative 5C primer design scheme that provides insights into the general spatial conformation of a genomic region. The 5C analysis described above included 10 forward and 10 reverse 5C primers for restriction fragments located in the gene desert control region (Fig. 
5A The graphs in Figure 5 Discussion The development of 3C has greatly facilitated the detection and study of cis and trans interactions between genes and regulatory elements. Here we developed 5C technology, a novel extension of 3C that should significantly expand the range of 3C applications by allowing comprehensive and large-scale mapping of chromatin interactions. Large-scale application of 5C will provide information about relationships between genes and regulatory elements and can be used to identify novel regulatory elements and to reveal higher-order chromosome structural features. Validation of 5C through analysis of the β-globin locus We have validated 5C by detection of chromatin looping interactions in the human β-globin locus. The most prominent interaction was observed between the LCR and the expressed γ-globin genes, which was specifically observed in γ-globin-expressing K562 cells. In addition, in both K562 and GM06990 cells, the LCR also interacted with the 3′-HS1 element and a large domain located 50– 100 kb upstream. The latter region corresponds to the region around HS-62.5/ HS-60 in the murine locus (Farrell et al. 2002; Bulger et al. 2003). Similar long-range interactions between HSs were observed in the mouse. The clustering of these HSs has been proposed to create a chromatin hub, or a specialized nuclear compartment dedicated to the transcription of the β-globin genes (Tolhuis et al. 2002; de Laat and Grosveld 2003). Several of the HSs in the β-globin locus bind the insulator binding protein CTCF (Farrell et al. 2002; Bulger et al. 2003) (and for human HS5, see Supplemental Fig. 6), and this protein has been proposed to mediate their interactions and formation of the chromatin hub (Patrinos et al. 2004). However, the functional significance of some of these interactions is not well understood as deletion of some of the elements does not directly affect β-globin expression (Bender et al. 2006). 
Our observation that interactions between several of the HSs also occur in GM06990 cells, which will never express β-globin, suggests that they may not directly regulate β-globin expression but are involved in some aspect of higher-order chromosome architecture. The interaction profiles of HS5 and HS2/3/4 are very similar, except that HS2/3/4 interacted more frequently with the β-globin locus specifically in K562 cells. This result is in agreement with observations that HS2 and HS3 have the strongest enhancer activity (Fraser et al. 1993; Peterson et al. 1996). In addition, RNA-TRAP showed that the expressed globin gene interacted most strongly with HS2 (Carter et al. 2002). We identified a new chromatin looping interaction between the LCR and the region between the γ- and δ-globin genes. This result is intriguing given that this region has been implicated in developmental control of the β-globin locus (O'Neill et al. 1999; Chakalova et al. 2005). This region contains a β-globin pseudogene that is activated in K562 cells (Supplemental Fig. 1 Microarray detection and quantitative sequencing Results obtained with microarray detection, quantitative sequencing and semiquantitative PCR are generally very comparable. However, we also observed several differences. First, the dynamic range of microarray detection was smaller than that of quantitative sequencing, as has been observed before (Yuen et al. 2002). Second, small quantitative differences were observed between the data sets obtained by microarray analysis, quantitative sequencing, and semiquantitative PCR, e.g., in the γ–δ intergenic region (Figs. 3, 4 Both detection methods have advantages. DNA sequencing displays a larger dynamic range and obviates the need to design a specific array for each genomic region of interest. On the other hand, microarray analysis is currently more cost-effective, particularly when a given genomic region needs to be analyzed under a large number of different conditions. 
5C data obtained by DNA sequencing allowed an estimate of the background of the LMA-based approach. We quantified 451 interactions between the β-globin locus and the control gene desert region, which are located on different chromosomes. These interactions are detected by forward primers located on one chromosome and reverse primers located on the other, and vice versa. There is no biological indication that the β-globin locus and the gene desert region should preferentially interact. Therefore, we reasoned that these interchromosomal interaction frequencies would correspond to background signals. Generally, we detected very low background interaction frequencies between the two regions (average interaction frequency 0.08 [standard error of the mean, or S.E.M. = 0.02] for K562 and 0.08 [S.E.M. = 0.01] for GM06990) (Supplemental Table 6), which is 75-fold lower than the interaction frequency between HS2/3/4 and the β-globin gene in K562 cells. However, we also detected a few higher interaction frequencies. We do not know whether these represent true interchromosomal contacts, false positives due to primer design, or experimental noise. Future large-scale 5C analyses will provide more detailed insights into these issues. 5C applications 5C relies on multiplexed ligation-mediated amplification and thus is potentially limited by the number of 5C primers that can be used in a single reaction. Other assays based on LMA have successfully used many thousands of primers in a single reaction. For example, the methylation status of 1534 CpG sites was assessed using a mixture of ~6000 primers (Bibikova et al. 2005). Another example is the use of highly multiplexed LMA with up to 20,000 Molecular Inversion Probes in a single reaction to detect single nucleotide polymorphisms (SNPs) (Hardenbol et al. 2005; Wang et al. 2005). 
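To give a sense of what such multiplexing levels would mean for 5C, the sketch below counts the forward-reverse primer pairs available from a given number of primers and the amount of DNA they would cover. It assumes an even split between forward and reverse primers, one primer per restriction fragment, and an average fragment size of about 4 kb; the function and parameter names are illustrative.

```python
def pairwise_capacity(n_primers, mean_fragment_kb=4.0):
    """Return (detectable forward x reverse pairs, covered DNA in Mb).

    Assumes an even forward/reverse split and one primer per restriction fragment.
    """
    n_forward = n_primers // 2
    n_reverse = n_primers - n_forward
    interactions = n_forward * n_reverse
    covered_mb = n_primers * mean_fragment_kb / 1000.0
    return interactions, covered_mb

interactions, covered_mb = pairwise_capacity(10_000)
print(interactions, covered_mb)
```

Because the number of pairs grows with the product of the forward and reverse primer counts, doubling the number of primers roughly quadruples the number of interactions that can be interrogated.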
When 5C is performed at a similar level of multiplexing, e.g., using 10,000 5C primers in a single experiment, up to 25 million distinct chromatin interactions can be detected in parallel involving up to 40 Mb (10,000 4-kb restriction fragments) of DNA. For highly multiplexed 5C analyses, it is important to carefully design 5C primers. Nine 5C primers that were used to generate the 5C libraries analyzed here perfectly recognized abundant interspersed repeats, and these primers were found to produce excessively large numbers of ligation products (see Supplemental Table 5B). Thus, repeated sequences must be avoided. We anticipate two types of 5C applications that are distinguished by 5C primer design schemes. First, 5C can be used for large-scale mapping of chromatin looping interactions between specific genomic elements of interest, similar to the analysis of the β-globin locus described here. Such studies are focused on mapping interactions between a “fixed” element (e.g., the LCR) and other restriction fragments located in cis or in trans, in order to identify elements with which it interacts. 5C allows simultaneous quantification of interaction profiles of many such “fixed” elements in parallel in a single reaction followed by analysis on a custom-designed microarray or by direct quantitative sequencing. To do this, reverse 5C primers are designed for each fixed fragment of interest, and forward 5C primers are designed for all other restriction fragments, as shown in Figure 4A Second, 5C analysis can be used to generate dense interaction maps that cover most or all potential interactions between all fragments of any genomic region. Dense interaction maps can provide a global overview of the conformation of a given genomic region. For example, when 5C forward and reverse primers are designed for alternating restriction fragments, as performed here for the gene desert control region (Fig. 
5)

Methods

BAC selection and control library preparation

A control library for the human β-globin locus and gene desert regions (ENCODE regions ENm009 and ENr313, respectively) was generated as described (Dekker 2006; Miele et al. 2006). Briefly, an array of bacterial artificial chromosomes (BACs) covering both genomic regions was mixed, digested with EcoRI, and randomly ligated. In this study, the BAC arrays from the β-globin locus and gene desert regions were mixed in a 4:1 ratio to obtain strong signals for the β-globin locus. Interaction frequencies were adjusted accordingly. The following seven BAC clones were used for the β-globin region: CTC-775N13, RP11-715G8, CTD-3048C22, CTD-3055E11, CTD-2643I7, CTD-3234J1, and RP11-589G14. A set of four BAC clones was selected to cover the 0.5-Mb gene desert region: RP11-197K24, RP11-609A13, RP11-454G21, and CTD-2133M23. BAC clones were obtained from Invitrogen and the Children's Hospital Oakland Research Institute (CHORI).

Cell culture and 3C analysis

The GM06990 cell line was derived from EBV-transformed B-lymphocytes and was obtained from Coriell Cell Repositories (CCR). This cell line was cultured in Roswell Park Memorial Institute medium 1640 (RPMI 1640) supplemented with 2 mM L-glutamine and 15% fetal bovine serum (FBS). The K562 cell line was obtained from the American Type Culture Collection (ATCC) and cultured in RPMI 1640 supplemented with 2 mM L-glutamine and 10% FBS. Both cell lines were grown at 37°C in 5% CO2 in the presence of 1% penicillin-streptomycin. 3C analysis was performed with log-phase GM06990 and K562 cells using EcoRI as previously described (Dekker et al. 2002; Vakoc et al. 2005; Miele et al. 2006). The primer sequences are presented in Supplemental Table 2.

Real-time PCR quantification

Total RNA from log-phase cells was isolated with the RNeasy Mini Kit as described by the manufacturer (Qiagen).
cDNA was synthesized with oligo(dT)20 (Invitrogen) using the Omniscript Reverse Transcription Kit (Qiagen). β-Globin transcripts were quantified by real-time PCR in the presence of SYBR Green I stain (Molecular Probes). The specific human β-globin primers used in this analysis are summarized in Supplemental Table 1.

5C primer design

Forward and reverse primers were designed to recognize the top or bottom strand of the 3′-end of EcoRI restriction fragments. Primer homology lengths varied from 24 to 40 nucleotides, and melting temperatures were centered at 72°C. The genomic uniqueness of all primers was verified with the SSAHA algorithm (Ning et al. 2001). Forward 5C primers were designed to include a 5′-end tail that includes (5′–3′) CTG followed by one MmeI restriction site (TCCAAC) and a modified T7 Universal primer sequence (TAATACGACTCACTATAGCC). Reverse 5C primers were designed to include a 3′-end tail that includes (5′–3′) a modified complementary T3 Universal sequence (TCCCTTTAGTGAGGGTTAATA) and one MmeI restriction site (GTCGGA), followed by CTC. 5C forward and reverse primers each contained half of the EcoRI restriction site, and only the reverse primers were phosphorylated at the 5′-end. All 5C primers are presented in Supplemental Table 3.

5C library preparation

The 3C library (representing ~150,000 genome copies) or control library (5 ng) was mixed with salmon testis DNA (Sigma) to a total DNA mass of 1.5 μg, and with 1.7 fmol of each 5C primer in a final volume of 10 μL of annealing buffer (20 mM Tris-acetate at pH 7.9, 50 mM potassium acetate, 10 mM magnesium acetate, and 1 mM DTT). Samples were denatured for 5 min at 95°C and annealed for 16 h at 48°C. Annealed primers were ligated for 1 h at 48°C by adding 20 μL of ligation buffer (25 mM Tris-HCl at pH 7.6, 31.25 mM potassium acetate, 12.5 mM magnesium acetate, 1.25 mM NAD, 12.5 mM DTT, 0.125% Triton X-100) containing 10 units of Taq DNA ligase (NEB).
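The tail layout given under 5C primer design can be written out explicitly, which also shows what any universal amplification primer has to match. The sketch below assembles the two tails from the components listed there and computes the reverse complement of the reverse-primer tail, i.e., the sequence a PCR primer would need in order to anneal to ligated products at that end. The homology sequences are hypothetical placeholders, not real 5C primers, and the helper names are illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

# Tail components (5'->3') as listed in the primer design paragraph.
FORWARD_TAIL = "CTG" + "TCCAAC" + "TAATACGACTCACTATAGCC"   # CTG, MmeI site, modified T7
REVERSE_TAIL = "TCCCTTTAGTGAGGGTTAATA" + "GTCGGA" + "CTC"  # modified T3 complement, MmeI site, CTC

def build_primers(forward_homology, reverse_homology):
    """Forward primers carry the tail at the 5' end, reverse primers at the 3' end."""
    return FORWARD_TAIL + forward_homology, reverse_homology + REVERSE_TAIL

# Hypothetical 24-nt homology regions, for illustration only.
fwd, rev = build_primers("ACGTTGCAACGTTGCAACGTTGCA", "TGCAACGTTGCAACGTTGCAACGT")
ligated = fwd + rev  # ligation joins the forward 3' end to the phosphorylated reverse 5' end

print(ligated.startswith(FORWARD_TAIL), ligated.endswith(REVERSE_TAIL))
print(reverse_complement(REVERSE_TAIL))
```

Because every ligated product starts with the same forward tail and ends with the same reverse tail, a single pair of universal primers suffices to amplify the whole library regardless of how many 5C primers were multiplexed.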
Reactions were terminated by incubating samples at 65°C for 10 min. 5C ligation products were amplified by PCR using forward (T7modif, CTGTCCAACTAATACGACTCACTATAGCC) and reverse (T3modif, GAGTCCGACTATTAACCCTCACTAAAGGGA) primers. Six microliters of ligation reaction were amplified with 10 pmol of each primer in 25-μL PCR reactions (32 cycles of 30 sec of denaturing at 95°C, 30 sec of annealing at 60°C, and 30 sec of extension at 72°C). 5C libraries were purified with MinElute Reaction Cleanup Kit (Qiagen) to remove unincorporated primers and other contaminants as recommended by the manufacturer.

Singleplex and sixplex 5C analysis

5C libraries from K562 and GM06990 (each representing ~150,000 genomes) or control libraries (5 ng) were incubated with individual 5C primer pairs and processed as described above, except that ligation reactions were amplified by 35 PCR cycles of 30 sec of denaturing at 95°C, 30 sec of annealing at 60°C, and 30 sec of extension at 72°C. Amplified 5C ligation products were resolved on 2% agarose gels and visualized with ethidium bromide (0.5 μg/mL). Sixplex 5C analysis was performed by mixing six distinct 5C primers with 3C or control libraries. Individual 5C ligation products of sixplex samples were detected by PCR with specific internal PCR primers and measured on agarose gels as described above. Linear-range PCR detection of 5C products was verified by twofold serial dilution titrations of multiplex samples. The sequences of internal primers are available on request.

5C library microarray analysis

5C libraries were prepared by performing multiplex LMA with 78 5C primers, and amplified with a 5′-Cy3-labeled reverse PCR primer complementary to the common 3′-end tail sequence of reverse 5C primers (Cy3-T3modif). Maskless array synthesis and hybridization were carried out with 100 ng of amplified 5C libraries at NimbleGen Systems Inc. as previously described (Singh-Gasson et al. 1999; Nuwaysir et al. 2002; Kim et al. 2005; Selzer et al. 2005). Each array featured the sense strand of predicted 5C ligation products. The arrays also contained several inter-region negative controls. The arrays contained 18 replicates of increasing feature lengths ranging from 30 to 64 nt, which were used to identify optimal array probe lengths (Supplemental Fig. 4).

5C library high-throughput DNA sequencing analysis

5C libraries were generated with 73 5C primers, as described in the Results section. Each library was amplified with 5′-end phosphorylated PCR primers and processed for single-molecule pyrosequencing as previously described (Margulies et al. 2005). Using the GS20 platform developed by 454 Life Sciences Corp., 550,189 sequence reads totaling 60 million bases were obtained. The mean read length was 108 bases (mode, 112 bases). Each read was BLASTed against all forward and reverse primers. For each sample, the number of reads that matched each of the 682 possible primer pairs (62 forward × 11 reverse) was counted. These combinations include 159 possible interactions in the β-globin locus, 72 interactions in the gene desert region, and 451 inter-region interactions. The data are summarized in Supplemental Tables 4 and 5.

ACKNOWLEDGMENTS

We thank the members of our laboratories for stimulating and helpful discussions, and Drs. J. Perry, J. Teodoro, and M. Walhout for critical reading of the manuscript. This work was supported by grants from the National Institutes of Health to J.D. (NHGRI-ENCODE HG003143), to R.D.G. (NHGRI-ENCODE HG003129), and to A.K. (CA109597). R.A.A. was supported through the NSF/NIH Joint Program in Mathematical Biology (NIH grant R01GM078986) and Jeffrey Epstein's sponsorship of the Program for Evolutionary Dynamics (Harvard University).

Footnotes

[Supplemental material is available online at www.genome.org.]

Article published online before print. Article and publication date are at http://www.genome.org/cgi/doi/10.1101/gr.5571506.
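The primer-pair tally described in the high-throughput sequencing analysis can be illustrated with a short sketch. It assumes each read has already been assigned a (forward, reverse) primer pair. The BLAST matching itself is not reproduced here. All primer identifiers are hypothetical. The per-region split used below (53 forward and 3 reverse β-globin primers, 9 forward and 8 reverse gene desert primers) is not stated in the text. It is inferred as the only split of 62 forward and 11 reverse primers that yields the reported 159, 72, and 451 pair counts.

```python
from collections import Counter
from itertools import product

# Hypothetical primer IDs; the 53/3 and 9/8 split is inferred from the
# reported totals, not taken from the article.
forward_primers = [f"F_globin_{i}" for i in range(53)] + [f"F_desert_{i}" for i in range(9)]
reverse_primers = [f"R_globin_{i}" for i in range(3)] + [f"R_desert_{i}" for i in range(8)]


def region(primer_id: str) -> str:
    """Extract the region label ('globin' or 'desert') from a hypothetical primer ID."""
    return primer_id.split("_")[1]


def pair_class(fwd: str, rev: str) -> str:
    """Classify a primer pair as within-region or inter-region."""
    return region(fwd) if region(fwd) == region(rev) else "inter-region"


# Every forward primer can in principle ligate to every reverse primer.
all_pairs = list(product(forward_primers, reverse_primers))
class_sizes = Counter(pair_class(f, r) for f, r in all_pairs)


def count_reads(read_hits):
    """Tally reads per primer pair, ignoring hits to unknown primers.

    read_hits: iterable of (forward_id, reverse_id) tuples, one per read.
    """
    valid = set(all_pairs)
    counts = Counter()
    for pair in read_hits:
        if pair in valid:
            counts[pair] += 1
    return counts


if __name__ == "__main__":
    print(len(all_pairs), dict(class_sizes))
    toy_hits = [
        ("F_globin_0", "R_globin_1"),
        ("F_globin_0", "R_globin_1"),
        ("F_desert_2", "R_globin_0"),
        ("F_unknown_0", "R_globin_0"),  # dropped: not a designed primer
    ]
    print(count_reads(toy_hits))
```

Keeping the within-region and inter-region classes separate is useful because inter-region pairs can serve as a background estimate for within-region interaction counts.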