Sample GSM1631185
Status Public on Sep 02, 2015
Title MCF-7 Hi-C
Sample type SRA
 
Source name Breast cancer cell line
Organism Homo sapiens
Characteristics cell line: MCF-7
Growth protocol MCF-10A cells were obtained from the Barbara Ann Karmanos Cancer Institute (Detroit, MI). The cells were maintained in monolayer in Dulbecco's modified Eagle's medium-F12 (DMEM/F12) (Invitrogen, 21041025) supplemented with 5% horse serum (Invitrogen, 16050122), 1% penicillin/streptomycin (Invitrogen, 15140122), 0.5 μg/ml hydrocortisone (Sigma, H-0888), 100 ng/ml cholera toxin (Sigma, C-8052), 10 μg/ml insulin (Sigma, I-1882), and 20 ng/ml recombinant human EGF (Peprotech, 100-15) as previously described (Debnath et al. 2003). MCF-7 cells were obtained from ATCC and were cultured in DMEM supplemented with 10% FBS and pen-strep.
Extracted molecule genomic DNA
Extraction protocol The Hi-C libraries were prepared as previously described (PMID:22652625).
 
Library strategy OTHER
Library source genomic
Library selection other
Instrument model Illumina HiSeq 2000
 
Data processing Library strategy:
The Hi-C data were mapped with Bowtie and binned at 6.5Mb, 1Mb, 250kb, 100kb and 40kb non-overlapping genomic intervals. Iterative mapping and correction of Hi-C data were performed as previously described (Imakaev et al. 2012). Biological replicates showed high reproducibility (Pearson's correlation coefficient > 0.9 for 1Mb resolution data). Similarly, the first eigenvector comparison of the replicates showed high reproducibility. For the downstream analyses, sequences obtained from both biological replicates were pooled and ICE-corrected to serve as a combined dataset.
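The binning and replicate comparison described above can be illustrated with a minimal sketch. It assumes read pairs have already been mapped to positions on a single chromosome; the function names (`bin_contacts`, `upper_triangle`, `pearson`) and the toy coordinates are hypothetical, and no ICE correction is applied.

```python
from math import sqrt

def bin_contacts(pairs, chrom_length, bin_size):
    """Count read pairs per (bin_i, bin_j) in non-overlapping bins (symmetric matrix)."""
    n_bins = -(-chrom_length // bin_size)  # ceiling division
    matrix = [[0] * n_bins for _ in range(n_bins)]
    for pos1, pos2 in pairs:
        i, j = pos1 // bin_size, pos2 // bin_size
        matrix[i][j] += 1
        if i != j:
            matrix[j][i] += 1
    return matrix

def upper_triangle(matrix):
    """Flatten the upper triangle (diagonal included) so each bin pair is counted once."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i, n)]

def pearson(xs, ys):
    """Pearson correlation coefficient; assumes neither vector is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Toy read pairs for two hypothetical replicates on a 160 kb chromosome.
rep1_pairs = [(10_000, 50_000), (10_000, 90_000), (45_000, 130_000), (45_000, 50_000)]
rep2_pairs = [(12_000, 55_000), (20_000, 95_000), (41_000, 150_000),
              (47_000, 52_000), (5_000, 70_000)]

rep1 = bin_contacts(rep1_pairs, chrom_length=160_000, bin_size=40_000)
rep2 = bin_contacts(rep2_pairs, chrom_length=160_000, bin_size=40_000)
r = pearson(upper_triangle(rep1), upper_triangle(rep2))
print(f"replicate correlation: {r:.3f}")
```

Comparing only the upper triangle avoids counting each symmetric bin pair twice; whether the original analysis did the same is not stated in this record.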
Compartment analysis: First, z-scores of the interaction matrices at 250kb resolution were generated as described previously (Lieberman-Aiden et al. 2009). Then, the Pearson correlation of the z-score matrices was calculated. Principal component analysis was performed on the correlation matrices (Lieberman-Aiden et al. 2009; Zhang et al. 2012); the first eigenvector typically represents the compartment profile (Lieberman-Aiden et al. 2009), with positive and negative values of the first eigenvector marking different compartments. Gene density was calculated for each compartment to assign the “A” and “B” compartment labels.
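As a rough illustration of the compartment procedure, the sketch below correlates the rows of a toy z-score matrix, extracts the leading eigenvector of the correlation matrix, and uses gene density to decide which sign is “A”. Several things here are assumptions rather than the submitters' method: power iteration stands in for a full principal component analysis, the gene-richer sign group is labelled “A”, and the matrix, the gene-density values and all function names are invented for the example.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient; assumes neither vector is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def correlation_matrix(matrix):
    """Pearson correlation between every pair of rows."""
    n = len(matrix)
    return [[pearson(matrix[i], matrix[j]) for j in range(n)] for i in range(n)]

def leading_eigenvector(sym, iterations=200):
    """Power iteration (a stand-in for PCA) on a symmetric matrix."""
    n = len(sym)
    vec = [float(i + 1) for i in range(n)]  # asymmetric start vector
    for _ in range(iterations):
        nxt = [sum(sym[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = sqrt(sum(v * v for v in nxt))
        vec = [v / norm for v in nxt]
    return vec

def mean(values):
    return sum(values) / len(values) if values else 0.0

def call_compartments(zscores, gene_density):
    """Label each bin 'A' or 'B'; the gene-richer sign group is called 'A' (assumption)."""
    eig = leading_eigenvector(correlation_matrix(zscores))
    positive = [g for e, g in zip(eig, gene_density) if e > 0]
    negative = [g for e, g in zip(eig, gene_density) if e <= 0]
    positive_is_a = mean(positive) >= mean(negative)
    return ["A" if (e > 0) == positive_is_a else "B" for e in eig]

# Toy 6-bin z-score matrix: bins 0-2 and bins 3-5 form two interaction groups.
toy_z = [
    [ 2.0,  1.5,  1.0, -1.0, -1.2, -0.8],
    [ 1.5,  2.0,  1.4, -0.9, -1.1, -1.0],
    [ 1.0,  1.4,  2.0, -0.7, -1.0, -1.3],
    [-1.0, -0.9, -0.7,  2.0,  1.3,  1.1],
    [-1.2, -1.1, -1.0,  1.3,  2.0,  1.6],
    [-0.8, -1.0, -1.3,  1.1,  1.6,  2.0],
]
toy_gene_density = [12, 10, 11, 3, 2, 4]  # hypothetical genes per bin

labels = call_compartments(toy_z, toy_gene_density)
print(labels)
```

Because the sign of an eigenvector is arbitrary, the gene-density step is what makes the labels comparable across chromosomes and samples.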
TAD Calling: TAD calling was performed by calculating the “insulation” score of each bin using the 40kb resolution combined Hi-C data (as previously described). The mean of the interactions across each bin was calculated. By sliding a 1Mb x 1Mb (25 bins x 25 bins) square along the diagonal of the interaction matrix for every chromosome, we obtained the “insulation score” of the interaction matrix. Valleys in the insulation score indicate a depletion of Hi-C interactions occurring across a bin. These 40kb valleys represent the TAD boundaries. To account for the variation of boundaries between replicates, we added a total of 160kb (80kb to each side) to each boundary, so the final boundaries span a 200kb region. All boundaries with a boundary strength < 0.15 were excluded as they were considered weak and non-reproducible. The insulation plots for the biological replicates showed high reproducibility (Pearson correlation coefficient = 0.80 for MCF-7 and 0.90 for MCF-10A replicates), suggesting the robustness of the method. Similarly, the overlap of detected boundaries also showed high reproducibility between the biological replicates (~85% TAD boundary overlap for MCF-7 and ~91% for MCF-10A). Therefore, we used the combined Hi-C replicates for the TAD analyses.
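The insulation approach can be sketched as follows, under stated assumptions: the square for bin i is taken to cover the `window` bins upstream (rows) against the `window` bins downstream (columns), the toy example uses a 2-bin window instead of 25 bins, and the boundary-strength filter (< 0.15) is not implemented because its definition is not given in this record. All function names and the toy matrix are hypothetical.

```python
def insulation_scores(matrix, window):
    """Mean contact in a window x window square crossing each bin; None near the edges."""
    n = len(matrix)
    scores = []
    for i in range(n):
        if i < window or i + window >= n:
            scores.append(None)  # the square would run off the matrix
            continue
        total = sum(matrix[r][c]
                    for r in range(i - window, i)
                    for c in range(i + 1, i + window + 1))
        scores.append(total / (window * window))
    return scores

def find_valleys(scores):
    """Bins strictly lower than the left neighbour and no higher than the right one.

    A flat-bottomed valley is reported once, at its first bin. A flat shoulder
    on a descending slope would also be flagged by this simple rule.
    """
    valleys = []
    for i in range(1, len(scores) - 1):
        left, mid, right = scores[i - 1], scores[i], scores[i + 1]
        if left is None or mid is None or right is None:
            continue
        if mid < left and mid <= right:
            valleys.append(i)
    return valleys

def boundary_intervals(valleys, bin_size=40_000, pad=80_000):
    """Extend each valley bin by `pad` on both sides (40 kb + 2 x 80 kb = 200 kb)."""
    return [(max(0, v * bin_size - pad), (v + 1) * bin_size + pad) for v in valleys]

# Toy 12-bin matrix with two domains (bins 0-5 and 6-11): strong contacts
# inside a domain, weak contacts between domains.
def domain(bin_index):
    return 0 if bin_index < 6 else 1

toy = [[10 if domain(i) == domain(j) else 1 for j in range(12)] for i in range(12)]

scores = insulation_scores(toy, window=2)
valleys = find_valleys(scores)
intervals = boundary_intervals(valleys)
print(scores)
print(valleys, intervals)
```

With noisy real data exact ties are unlikely, so the tie-handling rule matters mostly for idealized inputs like this one.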
Z-Score Calculation: We modeled the overall Hi-C decay with distance using a modified LOWESS method (alpha = 1%, IQR filter), as described previously (Sanyal et al. 2012). LOWESS calculates the weighted average and weighted standard deviation for every genomic distance and therefore normalizes for the genomic-distance signal bias.
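To show what normalizing for genomic distance means in practice, here is a deliberately simplified sketch. It is not the modified LOWESS method cited above: it replaces the LOWESS-weighted estimates with a plain (unweighted) mean and population standard deviation computed separately for each bin-to-bin distance, and it omits the IQR filter. The function name and the toy matrix are hypothetical.

```python
from math import sqrt

def distance_zscores(matrix):
    """Z-score each contact against all contacts at the same bin distance."""
    n = len(matrix)

    # Group upper-triangle values by distance |i - j|.
    by_distance = {}
    for i in range(n):
        for j in range(i, n):
            by_distance.setdefault(j - i, []).append(matrix[i][j])

    # Unweighted mean and population standard deviation per distance
    # (a stand-in for the LOWESS-weighted estimates).
    stats = {}
    for dist, values in by_distance.items():
        mean = sum(values) / len(values)
        variance = sum((v - mean) ** 2 for v in values) / len(values)
        stats[dist] = (mean, sqrt(variance))

    zscores = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            mean, sd = stats[abs(i - j)]
            # Distances with no spread (e.g. a single value) get a z-score of 0.
            zscores[i][j] = (matrix[i][j] - mean) / sd if sd > 0 else 0.0
    return zscores

# Toy symmetric 4-bin contact matrix.
toy_contacts = [
    [20,  8,  3,  1],
    [ 8, 22, 12,  2],
    [ 3, 12, 18, 10],
    [ 1,  2, 10, 24],
]
toy_zscores = distance_zscores(toy_contacts)
for row in toy_zscores:
    print([round(value, 2) for value in row])
```

After this step, a contact is judged relative to others at the same separation, so the strong near-diagonal signal no longer dominates downstream correlation analyses.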
Genome_build: hg19
Supplementary_files_format_and_content: Hi-C: tar ball archive of all normalized/corrected Hi-C data matrices binned at 40kb/250kb/1Mb, TAD boundaries at 40kb and genomic compartments at 250kb resolution
Supplementary_files_format_and_content: RNA-Seq: raw gene counts after RSEM analysis and the output file for the DeSeq2 analysis
 
Submission date Mar 10, 2015
Last update date May 15, 2019
Contact name Rasim Barutcu
Organization name ScitoVation
Street address 6 Davis Drive
City Durham
State/province NC
ZIP/Postal code 27709
Country USA
 
Platform ID GPL11154
Series (1)
GSE66733 Chromatin interaction analysis reveals changes in small chromosome and telomere clustering between epithelial and breast cancer cells
Relations
BioSample SAMN03397467
SRA SRX950725

Supplementary data files not provided
Processed data are available on Series record
Raw data are available in SRA
