Sample GSM1942109
Status Public on Aug 23, 2016
Title MCF-7 shRUNX1 RNA-seq Replicate3
Sample type SRA
 
Source name Human breast cancer cell line
Organism Homo sapiens
Characteristics cell line: MCF7
Growth protocol MCF-7 cells were obtained from ATCC and cultured in DMEM supplemented with 10% fetal bovine serum and 5% penicillin/streptomycin. For shRNA-mediated knockdown of RUNX1 expression, MCF-7 cells were plated in six-well plates (1 × 10^5 cells per well) and infected 24 h later with lentivirus expressing shRUNX1 or nonspecific shRNA (Thermo Scientific). Briefly, cells were treated with 0.5 ml of lentivirus and 1.5 ml of complete fresh high-glucose DMEM per well, with polybrene at a final concentration of 4 μg/ml. Plates were centrifuged upon addition of the virus at 1460 × g at 37 °C for 30 min. Infection efficiency was monitored by GFP co-expression 2 days after infection. Cells were selected with 2 μg/ml puromycin for at least two additional days. After removal of the floating cells, the remaining attached cells were passaged and analyzed.
Extracted molecule total RNA
Extraction protocol Hi-C was sequenced using PE100 reads; RNA-seq and ChIP-seq were sequenced using SE100 reads by using a HiSeq 2000 instrument
 
Library strategy RNA-Seq
Library source transcriptomic
Library selection cDNA
Instrument model Illumina HiSeq 2000
 
Data processing RNA-seq: Adapter sequences were first removed from the RNA-seq reads. Ribosomal RNA reads were then filtered out using Bowtie (Song et al. 2014). After adapter removal and rRNA filtering, the reads were aligned to a transcriptome and quantified using RSEM v1.2.7 (Li and Dewey 2011). Differential gene expression was calculated with the DESeq2 package in R 3.1.0, using the mean value of gene-wise dispersion estimates (Love et al. 2014).
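As a rough illustration of how a shRUNX1/shNS log2 fold change relates to per-gene expected counts, the sketch below computes a naive ratio of condition means with a pseudocount. This is not the DESeq2 procedure used for this record (it ignores size-factor normalization, dispersion estimation and shrinkage); the gene names, counts and the pseudocount of 1 are hypothetical values chosen for the example.

```python
from math import log2

def naive_log2_fold_change(control_counts, knockdown_counts, pseudocount=1.0):
    """Naive log2 ratio of mean counts (knockdown over control).

    Assumption: a simple pseudocount avoids division by zero; DESeq2 does
    something considerably more elaborate.
    """
    control_mean = sum(control_counts) / len(control_counts)
    knockdown_mean = sum(knockdown_counts) / len(knockdown_counts)
    return log2((knockdown_mean + pseudocount) / (control_mean + pseudocount))

# Hypothetical expected counts for three replicates per condition.
expected_counts = {
    "GENE_A": {"shNS": [100.0, 120.0, 110.0], "shRUNX1": [50.0, 60.0, 55.0]},
    "GENE_B": {"shNS": [30.0, 30.0, 30.0], "shRUNX1": [30.0, 30.0, 30.0]},
}

fold_changes = {
    gene: naive_log2_fold_change(counts["shNS"], counts["shRUNX1"])
    for gene, counts in expected_counts.items()
}
```

A negative value means lower expression in the shRUNX1 samples than in the shNS controls; a value of zero means no change in mean counts.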
ChIP-seq: ~1 × 10^7 parental MCF-7 cells were crosslinked with formaldehyde at room temperature for 10 minutes. The cells were then lysed using lysis buffer A (50mM HEPES, 140mM NaCl, 1mM EDTA pH=8, 10% Glycerol, 0.5% NP-40, 0.25% Triton X-100), and the residual cytoplasmic protein was removed using lysis buffer B (10mM Tris-HCl pH=8, 200mM NaCl, 1mM EDTA, 1mM EGTA). The nuclear fraction was released using lysis buffer C (10mM Tris-HCl pH=8, 100mM NaCl, 1mM EDTA, 1mM EGTA, 0.1% Sodium Deoxycholate, 0.5% N-lauroylsarcosine). The chromatin was then sheared using a Covaris S2 instrument with 10% duty cycle, intensity 5, 200 cycles per burst, frequency sweeping mode and a 60-second process time, for 4 cycles. The pull-down was performed using the RUNX1 antibody (Cell Signaling #4334). Samples were washed three times with RIPA buffer (Tris-HCl pH=8, 150mM NaCl, 1mM EDTA, 1% NP-40, 0.25% Sodium deoxycholate, 0.1% SDS) and were eluted. The pull-down and input control sequencing libraries were generated using the NEXTflex Rapid DNA Sequencing Kit (Bioo Scientific #5144-02) and were sequenced using SE100 reads on a HiSeq 2000 instrument.
Hi-C: The Hi-C data were mapped with Bowtie and binned at 6.5Mb, 1Mb, 250kb, 100kb and 40kb non-overlapping genomic intervals. Iterative mapping and correction of Hi-C data were performed as previously described (Imakaev et al. 2012). Biological replicates showed high reproducibility (Pearson's correlation coefficient > 0.9 for 1Mb resolution data). Similarly, the first eigenvector comparison of the replicates showed high reproducibility. For the downstream analyses, sequences obtained from both biological replicates were pooled and ICE-corrected to serve as a combined dataset.
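The binning and replicate-comparison steps can be sketched as follows. This is a minimal illustration, not the pipeline used for this record: the function names, the toy contact coordinates and the 100 bp bin size are assumptions made for the example, and no iterative correction (ICE) is applied.

```python
from math import sqrt

def bin_contacts(contacts, chrom_length, bin_size):
    """Count contacts into a symmetric matrix of non-overlapping bins.

    contacts: iterable of (pos1, pos2) coordinates on a single chromosome.
    """
    n_bins = -(-chrom_length // bin_size)  # ceiling division
    matrix = [[0] * n_bins for _ in range(n_bins)]
    for pos1, pos2 in contacts:
        i, j = pos1 // bin_size, pos2 // bin_size
        matrix[i][j] += 1
        if i != j:
            matrix[j][i] += 1  # mirror the count to keep the matrix symmetric
    return matrix

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def upper_triangle(matrix):
    """Flatten the upper triangle (diagonal included) so each pair counts once."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i, n)]

# Toy replicates on a hypothetical 300 bp chromosome with 100 bp bins.
rep1 = bin_contacts([(10, 20), (10, 120), (150, 250), (210, 290)], 300, 100)
rep2 = bin_contacts([(15, 25), (30, 130), (160, 260), (220, 280), (5, 50)], 300, 100)

replicate_r = pearson(upper_triangle(rep1), upper_triangle(rep2))
```

In practice the same comparison would be run on corrected matrices at the resolution of interest (1Mb in the record above) before pooling the replicates.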
Compartment analysis: First, the z-scores of the interaction matrices at 250kb resolution were generated as described previously (Lieberman-Aiden et al. 2009). Then, the Pearson correlation of the z-score matrices was calculated. Principal component analysis was then performed (Lieberman-Aiden et al. 2009; Zhang et al. 2012); the first eigenvector typically represents the compartment profile (Lieberman-Aiden et al. 2009), with positive and negative values of the first eigenvector representing different compartments. Gene density for each compartment was calculated to assign the “A” and “B” compartment labels.
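A minimal sketch of this compartment-calling idea is shown below, under stated assumptions: the leading eigenvector of the row-wise Pearson correlation matrix is obtained by power iteration (a stand-in for a full PCA routine), every row of the z-score matrix has non-zero variance, and the toy z-scores and gene densities are hypothetical.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation; assumes neither sequence is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def first_eigenvector(matrix, iterations=200):
    """Leading eigenvector of a symmetric matrix via power iteration."""
    n = len(matrix)
    vec = [1.0 / (i + 1) for i in range(n)]  # deterministic starting vector
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = sqrt(sum(v * v for v in nxt))
        vec = [v / norm for v in nxt]
    return vec

def call_compartments(zscores, gene_density):
    """Label each bin 'A' or 'B'; the gene-richer sign group is called 'A'."""
    n = len(zscores)
    corr = [[pearson(zscores[i], zscores[j]) for j in range(n)] for i in range(n)]
    eigvec = first_eigenvector(corr)
    pos = [gene_density[i] for i in range(n) if eigvec[i] > 0]
    neg = [gene_density[i] for i in range(n) if eigvec[i] <= 0]
    mean = lambda values: sum(values) / len(values) if values else 0.0
    positive_is_a = mean(pos) >= mean(neg)
    return ["A" if (eigvec[i] > 0) == positive_is_a else "B" for i in range(n)]

# Toy 4-bin z-score matrix with a checkerboard-like pattern.
toy_zscores = [
    [2.0, 1.0, -1.0, -2.0],
    [1.0, 2.0, -2.0, -1.0],
    [-1.0, -2.0, 2.0, 1.0],
    [-2.0, -1.0, 1.0, 2.0],
]
labels = call_compartments(toy_zscores, gene_density=[12, 9, 2, 1])
```

The sign of an eigenvector is arbitrary, which is why an external signal such as gene density is needed to decide which sign corresponds to the "A" compartment.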
TAD calling: TAD calling was performed by calculating the “insulation” score of each bin using the 40kb resolution combined Hi-C data (as previously described). The mean of the interactions across each bin was calculated. By sliding a 1Mb × 1Mb (25 bins × 25 bins) square along the diagonal of the interaction matrix for every chromosome, we obtained the “insulation score” of the interaction matrix. Valleys in the insulation score indicated the depletion of Hi-C interactions occurring across a bin. These 40kb valleys represent the TAD boundaries. Based on the variation of boundaries between replicates, we chose to add a total of 160kb (80kb to each side) to the boundary to account for replicate variation. The final boundaries therefore span a 200kb region. All boundaries with a boundary strength < 0.15 were excluded as they were considered weak and non-reproducible. The insulation plots for the biological replicates showed high reproducibility, suggesting the robustness of the method. Similarly, the overlap of detected boundaries also showed high reproducibility between the biological replicates. Therefore, we used the combined Hi-C replicates for the TAD analyses.
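The sliding-square idea can be sketched as below. This is an illustrative approximation rather than the exact procedure used here: the indexing convention for the square (the `window` bins upstream against the `window` bins downstream of each bin), the tie-breaking rule for flat valleys and the toy matrix are all assumptions, and the boundary-strength filter is omitted because its definition is not given in this record.

```python
def insulation_scores(matrix, window):
    """Mean contact count in the square linking bins upstream and downstream of i.

    Bins too close to the matrix edge for the square to fit get None.
    """
    n = len(matrix)
    scores = [None] * n
    for i in range(window, n - window):
        values = [
            matrix[row][col]
            for row in range(i - window, i)
            for col in range(i + 1, i + window + 1)
        ]
        scores[i] = sum(values) / len(values)
    return scores

def find_valleys(scores):
    """Bins whose score is below the left neighbor and not above the right one."""
    valleys = []
    for i in range(1, len(scores) - 1):
        left, here, right = scores[i - 1], scores[i], scores[i + 1]
        if None in (left, here, right):
            continue
        if here < left and here <= right:
            valleys.append(i)
    return valleys

def widen_boundary(bin_index, bin_size=40_000, flank=80_000):
    """Extend a one-bin boundary by `flank` bp on each side (40kb + 2 x 80kb = 200kb)."""
    start = max(0, bin_index * bin_size - flank)
    end = (bin_index + 1) * bin_size + flank
    return start, end

# Toy matrix: two 4-bin domains with strong internal and weak cross-domain contacts.
toy_matrix = [[10 if (a < 4) == (b < 4) else 1 for b in range(8)] for a in range(8)]

scores = insulation_scores(toy_matrix, window=2)
boundaries = find_valleys(scores)
regions = [widen_boundary(b) for b in boundaries]
```

With 40kb bins, a 1Mb square corresponds to `window=25`; the toy example uses `window=2` only to keep the matrix small.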
Z-score calculation: We modeled the overall Hi-C decay with distance using a modified LOWESS method (alpha = 1%, IQR filter), as described previously (Sanyal et al. 2012). LOWESS calculates the weighted average and weighted standard deviation for every genomic distance and therefore normalizes for the genomic-distance signal bias.
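To make the distance normalization concrete, the sketch below computes a z-score for each interaction against all interactions at the same genomic distance. It deliberately swaps the modified LOWESS fit for a plain per-distance mean and standard deviation (no smoothing, no weighting, no IQR filter), so it shows the principle only; the function name and the toy matrix are hypothetical.

```python
from math import sqrt

def distance_zscores(matrix):
    """Z-score each contact against all contacts at the same bin distance.

    Simplification: unweighted mean and population standard deviation per
    diagonal, standing in for the LOWESS-derived expected value and spread.
    """
    n = len(matrix)
    z = [[0.0] * n for _ in range(n)]
    for distance in range(n):
        values = [matrix[i][i + distance] for i in range(n - distance)]
        mean = sum(values) / len(values)
        sd = sqrt(sum((v - mean) ** 2 for v in values) / len(values))
        for i in range(n - distance):
            # A diagonal with no spread carries no information; report 0.
            score = (matrix[i][i + distance] - mean) / sd if sd > 0 else 0.0
            z[i][i + distance] = score
            z[i + distance][i] = score
    return z

# Toy symmetric 4-bin interaction matrix.
toy_matrix = [
    [20, 10, 4, 2],
    [10, 20, 6, 4],
    [4, 6, 20, 14],
    [2, 4, 14, 20],
]
zscores = distance_zscores(toy_matrix)
```

Here the contact between bins 2 and 3 is stronger than other adjacent-bin contacts and gets a positive z-score, while the contact between bins 1 and 2 gets a negative one.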
Genome_build: hg19
Supplementary_files_format_and_content: Hi-C: Interaction matrices at 2.5Mb, 1Mb, 250kb and 40kb resolution; RNA-seq: Expected count file for all genes and a txt file showing the shRUNX1/shNS log2 fold change and p-values for all genes; ChIP-seq: bedGraph UCSC visualization files and the MCF7 RUNX1 peak BED file
Supplementary_files_format_and_content: MCF7_shRUNX1_shNS_processed_HiC_files - all normalized/corrected Hi-C data matrices binned at 40kb/250kb/1Mb/2.5Mb
Supplementary_files_format_and_content: MCF7_RUNX1_ChIPseq_peaks.bed - BED file of RUNX1 ChIP-seq peaks
Supplementary_files_format_and_content: MCF7_shNS_shRUNX1_genes_expression_expected_count.tsv - TSV file showing the RNA-seq counts for each gene
Supplementary_files_format_and_content: MCF7_shRUNX1_shNS_RNAseq_log2_foldchange.txt - TSV file showing the log2 fold change of all genes between shRUNX1 and shNS samples
 
Submission date Nov 16, 2015
Last update date May 15, 2019
Contact name Rasim Barutcu
Organization name ScitoVation
Street address 6 Davis Drive
City Durham
State/province NC
ZIP/Postal code 27709
Country USA
 
Platform ID GPL11154
Series (1)
GSE75070 RUNX1 contributes to higher-order chromatin organization and gene regulation in breast cancer cells.
Relations
BioSample SAMN04272130
SRA SRX1434850

Supplementary data files not provided
Raw data are available in SRA
Processed data are available on Series record
