Sample GSM2838584
Status Public on Apr 30, 2018
Title mRG3
Sample type SRA
Source name Flp-In T-Rex 293
Organism Homo sapiens
Characteristics cell line: Flp-In T-Rex 293
treatment: mutant-DDX3X
Treatment protocol Protein expression induced with 0.1 µg/ml doxycycline.
Growth protocol Cells were grown to subconfluence in DMEM 10% FCS.
Extracted molecule cytoplasmic RNA
Extraction protocol RNA extracted using TRI® reagent (Sigma-Aldrich) as per manufacturer's instructions. iCLIP performed as in Huppertz et al., Methods, 2014 with changes allowing cytoplasmic RNP isolation (Herdy et al., Nat. Immunol, 2012).
RNA-seq libraries were generated using the Illumina TruSeq Stranded total RNA kit (Illumina, cat #RS-122-2301) as per the manufacturer’s instructions. iCLIP libraries prepared as in Huppertz et al., Methods, 2014 with RT buffer allowing efficient quadruplex reverse transcription (Kwok et al., Nat. Methods, 2016). Sequenced using an Illumina NextSeq 500.
Library strategy OTHER
Library source transcriptomic
Library selection other
Instrument model Illumina NextSeq 500
Data processing Library strategy: iCLIP-Seq
RNA-seq: raw FASTQ sequencing files were preprocessed using cutadapt to remove sequencing adapters and low-quality sequencing tails (option -q 10). Trimmed files were aligned to the human genome (GRCh37/hg19) using tophat2, with the UCSC GTF file provided by Illumina iGenomes as the annotation file. Gene counts were calculated using htseq-count and the same GTF file. Differential expression analysis was done using the R package edgeR.
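As an illustration of working with gene counts of this kind, the following is a minimal sketch for loading htseq-count output in Python. It assumes the usual two-column, tab-separated layout (feature ID, count) in which summary rows such as `__no_feature` and `__ambiguous` start with a double underscore. The function names are hypothetical, and the counts-per-million scaling is a simple illustration, not the normalization edgeR applies.

```python
import io
from typing import Dict, TextIO


def read_htseq_counts(handle: TextIO) -> Dict[str, int]:
    """Parse htseq-count output, skipping the '__' summary rows."""
    counts: Dict[str, int] = {}
    for line in handle:
        line = line.rstrip("\n")
        if not line:
            continue
        feature_id, value = line.split("\t")
        if feature_id.startswith("__"):
            # Summary counters (e.g. __no_feature) are not genes.
            continue
        counts[feature_id] = int(value)
    return counts


def counts_per_million(counts: Dict[str, int]) -> Dict[str, float]:
    """Scale raw counts by the total assigned count (illustration only)."""
    total = sum(counts.values())
    if total == 0:
        return {gene: 0.0 for gene in counts}
    return {gene: count * 1e6 / total for gene, count in counts.items()}


# Made-up example data in the assumed format.
example = io.StringIO("DDX3X\t300\nACTB\t700\n__no_feature\t50\n__ambiguous\t5\n")
raw = read_htseq_counts(example)
cpm = counts_per_million(raw)
```

For differential expression, the raw integer counts (not the scaled values) would be passed to edgeR, which performs its own library-size normalization.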
iCLIP-seq: identical reads were removed from the file, libraries were demultiplexed according to the 4-nucleotide pattern sequence at the 5' end of each read (e.g., N3-GGTT-N2), and reads were then pre-processed with cutadapt to trim 3' sequencing adapters and low-quality sequencing tails. Highly repetitive reads, i.e. those containing a run of at least 10 identical nucleotides (e.g., A{10,n}), were removed, and the remaining reads were aligned to the hg19 version of the human genome using bwa mem. After alignment, reads with MAPQ < 10 were removed, as were reads aligning to the same position with the same barcode, since these are most likely PCR duplicates. Regions with signal above 10 were extracted, and regions closer than 30 nt were merged into a single peak region. Merged peak regions less than 30 nt in width were filtered out. Additionally, libraries were aligned to the transcriptome and processed: from the filtered bwa-aligned BAM files, only reads aligning with MAPQ >= 10 to expressed transcripts were retained. Expressed transcripts were defined as those having an average FPKM of at least 0.1 per condition (WT, mutant or negative) in the RNA-seq libraries, as assessed by Cufflinks isoform quantification. The remaining unfiltered reads were aligned to the expressed transcriptome using RSEM. Transcript coverage files were calculated and normalized to the total estimated count in each iCLIP library, and peaks were called as described above for the genome alignments. For peaks below 100 nt, the 100 base pairs around the middle of the peak were taken as the binding region, and the sequences were extracted. WT- or mutant-specific peaks were computed with bedtools.
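The repetitive-read filter and the peak-calling thresholds described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the submitters' code: coverage is assumed to be given as sorted bedGraph-like tuples with 0-based half-open coordinates, "above 10" is read as strictly greater than 10, "closer than 30 nt" as a gap below 30, and all function names are hypothetical.

```python
import re
from typing import Iterable, List, Tuple

Peak = Tuple[str, int, int]  # (chrom, start, end), 0-based half-open

# A run of ten or more identical nucleotides, e.g. AAAAAAAAAA.
HOMOPOLYMER = re.compile(r"([ACGTN])\1{9,}")


def is_highly_repetitive(seq: str) -> bool:
    """True if the read contains at least 10 identical bases in a row."""
    return HOMOPOLYMER.search(seq.upper()) is not None


def call_peaks(
    coverage: Iterable[Tuple[str, int, int, float]],
    threshold: float = 10,
    merge_gap: int = 30,
    min_width: int = 30,
) -> List[Peak]:
    """Keep intervals with signal > threshold, merge neighbours closer
    than merge_gap, then drop merged regions narrower than min_width.
    Input must be sorted by chromosome and start position."""
    merged: List[Peak] = []
    for chrom, start, end, value in coverage:
        if value <= threshold:
            continue
        if merged and merged[-1][0] == chrom and start - merged[-1][2] < merge_gap:
            prev = merged[-1]
            merged[-1] = (chrom, prev[1], max(prev[2], end))
        else:
            merged.append((chrom, start, end))
    return [p for p in merged if p[2] - p[1] >= min_width]


def binding_region(start: int, end: int, width: int = 100) -> Tuple[int, int]:
    """For peaks shorter than `width`, return a window of `width` bases
    centred on the peak midpoint (clamped at 0, so it may be shorter
    near the start of a sequence); longer peaks are returned unchanged."""
    if end - start >= width:
        return start, end
    mid = (start + end) // 2
    return max(0, mid - width // 2), mid + width // 2


# Made-up coverage values for illustration.
coverage = [
    ("chr1", 100, 120, 15),
    ("chr1", 120, 130, 5),
    ("chr1", 140, 160, 12),
    ("chr1", 300, 310, 50),
    ("chr2", 0, 40, 11),
]
peaks = call_peaks(coverage)
```

In this toy input the two chr1 intervals at 100-120 and 140-160 are merged because the gap between them is 20 nt, while the isolated 10 nt region at 300-310 is discarded for being narrower than 30 nt.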
Genome_build: hg19 (GRCh37)
Supplementary_files_format_and_content: *.peaks.bed
Supplementary_files_format_and_content: *.norm.bedGraph
Supplementary_files_format_and_content: *_hg19_accepted_hits.nsort.htseq.txt: Raw counts.
Submission date Nov 02, 2017
Last update date May 15, 2019
Contact name Giovanni Marsico
Organization name CRUK Cambridge Institute
Street address Robinson Way
City Cambridge
ZIP/Postal code CB2 0RE
Country United Kingdom
Platform ID GPL18573
Series (1)
GSE106476 RG/RGG boxes are common binding motifs in RNA-G-quadruplex-interacting proteins
BioSample SAMN07977230
SRA SRX3362124

Supplementary file Size Download File type/resource
GSM2838584_pc_lnc_mRG3.norm.bedGraph.gz 2.1 Mb (ftp)(http) BEDGRAPH
GSM2838584_pc_lnc_mRG3.peaks.bed.gz 65.7 Kb (ftp)(http) BED
Raw data are available in SRA
Processed data provided as supplementary file
Processed data are available on Series record

