Series GSE32038
Status Public on Sep 10, 2011
Title Differential Gene and Transcript Expression Analysis with TopHat and Cufflinks
Organism Drosophila melanogaster
Experiment type Expression profiling by high throughput sequencing
Summary This submission includes the sample data for a protocol covering differential expression analysis with TopHat and Cufflinks. The protocol also covers several accessory tools and utilities that aid in managing data, including CummeRbund, a tool for visualizing RNA-Seq analysis results. While the procedure assumes basic informatics skills, these tools assume little to no background with RNA-Seq analysis and are meant for novices and experts alike. The protocol begins with raw sequencing reads and produces a transcriptome assembly, lists of differentially expressed and regulated genes and transcripts, and publication-quality visualizations of analysis results.
Overall design The example data was generated in silico to closely resemble a real experiment in Drosophila melanogaster. First, expression values in cultured S2 cells were calculated for FlyBase 5.2 transcripts. These values were used to generate three sequencing replicates for condition "C1", with underlying variability in expression across replicates simulated by fitting a negative binomial model to the real S2 read count data. A second simulated condition, "C2", was generated by perturbing expression for 300 randomly selected genes. Each gene was perturbed by selecting its most highly expressed isoform and increasing that isoform's relative expression threefold. Three replicates of this condition were sequenced as above. Simulated sequencing was performed by picking a transcript from the FlyBase transcriptome with probability equal to its abundance, choosing a fragment length from a normal distribution with mean = 180 bp and standard deviation = 20 bp, and then choosing a start point for the fragment within the transcript uniformly at random. Total sequencing yield for each replicate was chosen to match that of the real S2 data. Each replicate was mapped separately to the fly genome with TopHat v1.3.1. The replicates were assembled separately with Cufflinks v1.1.0. The replicate assemblies were merged with Cuffmerge. This merged assembly was then analysed for differentially expressed and regulated genes with Cuffdiff.
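The fragment-sampling step described under Overall design can be illustrated with a minimal Python sketch. This is not the submitters' simulator. The function name `simulate_fragments`, the toy transcript table, and the choice to redraw fragments that do not fit inside the chosen transcript are all assumptions made for illustration. The record states only the three sampling rules: transcript chosen by abundance, fragment length drawn from a normal distribution with mean 180 bp and standard deviation 20 bp, and a uniformly random start point.

```python
import random


def simulate_fragments(transcripts, n_fragments, mean_len=180.0, sd_len=20.0, seed=0):
    """Sample fragments as (transcript_name, start, end) tuples, end exclusive.

    transcripts: dict mapping a transcript name to (length_bp, abundance).
    Hypothetical helper; a sketch of the sampling rules, not the original tool.
    """
    rng = random.Random(seed)
    names = list(transcripts)
    weights = [transcripts[name][1] for name in names]
    fragments = []
    while len(fragments) < n_fragments:
        # Pick a transcript with probability proportional to its abundance.
        name = rng.choices(names, weights=weights, k=1)[0]
        tx_len = transcripts[name][0]
        # Draw a fragment length from a normal distribution (180 bp, sd 20 bp).
        frag_len = int(round(rng.gauss(mean_len, sd_len)))
        # Assumption: redraw when the fragment cannot fit in the transcript.
        if frag_len < 1 or frag_len > tx_len:
            continue
        # Choose the start point uniformly among positions where it fits.
        start = rng.randint(0, tx_len - frag_len)
        fragments.append((name, start, start + frag_len))
    return fragments


# Toy abundances, not values from the real S2 data.
toy_transcripts = {"txA": (1000, 10.0), "txB": (500, 1.0)}
sampled = simulate_fragments(toy_transcripts, n_fragments=1000)
```

Under this sketch, condition "C2" could be imitated by multiplying the abundance of one isoform by three before sampling. Replicate-to-replicate variability from the negative binomial model is not modelled here.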
Contributor(s) Trapnell C, Roberts A, Goff L, Pertea G, Kim D, Kelley DR, Pimentel H, Salzberg SL, Rinn JL, Pachter L
Citation(s) 22383036
Submission date Sep 09, 2011
Last update date Feb 15, 2019
Contact name Cole Trapnell
Organization name Harvard University
Department Stem Cell and Regenerative Biology
Lab John Rinn
Street address 7 Divinity Ave
City Cambridge
State/province MA
ZIP/Postal code 02138
Country USA
Platforms (1)
GPL11203 Illumina Genome Analyzer IIx (Drosophila melanogaster)
Samples (6)
GSM794483 Simulated Condition 1, replicate 1
GSM794484 Simulated Condition 1, replicate 2
GSM794485 Simulated Condition 1, replicate 3
BioProject PRJNA147681

Download family Format
SOFT formatted family file(s) SOFT
MINiML formatted family file(s) MINiML
Series Matrix File(s) TXT

Supplementary file Size Download File type/resource
GSE32038_RAW.tar 1.9 Gb (http)(custom) TAR (of BAM, GTF)
GSE32038_cds.diff.gz 225.6 Kb (ftp)(http) DIFF
GSE32038_cds.fpkm_tracking.gz 521.5 Kb (ftp)(http) FPKM_TRACKING
GSE32038_cds_exp.diff.gz 458.6 Kb (ftp)(http) DIFF
GSE32038_gene_exp.diff.gz 505.8 Kb (ftp)(http) DIFF
GSE32038_genes.fpkm_tracking.gz 565.2 Kb (ftp)(http) FPKM_TRACKING
GSE32038_isoform_exp.diff.gz 714.6 Kb (ftp)(http) DIFF
GSE32038_isoforms.fpkm_tracking.gz 1.0 Mb (ftp)(http) FPKM_TRACKING
GSE32038_merged.gtf.gz 1.9 Mb (ftp)(http) GTF
GSE32038_promoters.diff.gz 276.3 Kb (ftp)(http) DIFF
GSE32038_simulated_fastq_files.tar.gz 1.6 Gb (ftp)(http) TAR
GSE32038_splicing.diff.gz 303.9 Kb (ftp)(http) DIFF
GSE32038_tss_group_exp.diff.gz 569.0 Kb (ftp)(http) DIFF
GSE32038_tss_groups.fpkm_tracking.gz 659.8 Kb (ftp)(http) FPKM_TRACKING
Processed data provided as supplementary file
Processed data is available on Series record
Raw data is available on Series record
