Series GSE117217
Status Public on Jul 18, 2018
Title Remapping the SRA: Drosophila melanogaster RNA-Seq data from the Sequence Read Archive
Organism Drosophila melanogaster
Experiment type Third-party reanalysis; Expression profiling by high throughput sequencing
Summary The Sequence Read Archive (SRA) contains over 52 terabases, or 482 billion reads, from Drosophila melanogaster (as of June 2018). These data are massively underused by the community and include 14,423 RNA-Seq samples, roughly 7 times the size of modENCODE. Currently, the major challenge is finding high-quality datasets that are suitable for inclusion in new studies. To help the community overcome this hurdle, we re-processed all D. melanogaster RNA-Seq SRA experiments (SRXs) using an identical workflow. The workflow uses a data-driven approach to identify technical metadata (i.e., strandedness and layout) for each sample in order to optimize mapping parameters. It generates QC metrics and coverage tracks based on the dm6 assembly, and calculates gene-level, junction-level, and intergenic counts against FlyBase r6.11. This resource will allow any researcher to visualize browser tracks for any publicly available dataset, quickly identify high-quality datasets for use in their own research, and download identically processed counts tables. There is a treasure trove of underused data sitting in the SRA, and this work addresses the first challenge in making data integration a common laboratory practice.
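
The record does not describe how strandedness and layout are inferred, so the following is only a minimal sketch of what a data-driven call could look like, not the submitters' method. The `StrandTally` type, the function names, the label strings, and the 0.75 threshold are all hypothetical; the idea is simply to classify a sample by the fraction of gene-overlapping reads on the annotated strand, and by whether one or two read files contain reads.

```python
from dataclasses import dataclass


@dataclass
class StrandTally:
    """Hypothetical per-sample tally of reads overlapping annotated genes."""
    sense: int      # reads on the same strand as the gene
    antisense: int  # reads on the opposite strand


def call_strandedness(tally: StrandTally, threshold: float = 0.75) -> str:
    """Classify a sample from its sense/antisense read fractions.

    The threshold is an assumed illustrative value, not one from the record.
    """
    total = tally.sense + tally.antisense
    if total == 0:
        return "undetermined"
    frac_sense = tally.sense / total
    if frac_sense >= threshold:
        return "same_strand"
    if frac_sense <= 1 - threshold:
        return "opposite_strand"
    return "unstranded"


def call_layout(n_read1: int, n_read2: int) -> str:
    """Classify layout from the number of reads found in each read file."""
    if n_read1 > 0 and n_read2 > 0:
        return "paired_end"
    if n_read1 > 0 or n_read2 > 0:
        return "single_end"
    return "undetermined"
```

A call such as `call_strandedness(StrandTally(sense=950, antisense=50))` would then be used to pick strand-aware mapping and counting parameters for that sample.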
 
Overall design Published Drosophila melanogaster RNA-Seq data were re-mapped to dm6 and processed with an identical workflow.
 
Contributor(s) Fear JM, Oliver B
Citation missing
Submission date Jul 17, 2018
Last update date Mar 25, 2019
Contact name Brian Oliver
E-mail(s) briano@nih.gov
Phone 301-204-9463
Organization name NIDDK, NIH
Department LBG
Lab Developmental Genomics
Street address 50 South Drive
City Bethesda
State/province MD
ZIP/Postal code 20892
Country USA
 
Platforms (18)
GPL9058 Illumina Genome Analyzer (Drosophila melanogaster)
GPL9061 Illumina Genome Analyzer II (Drosophila melanogaster)
GPL9333 454 GS FLX (Drosophila melanogaster)
Samples (14423)
GSM3273571 DRX013093
GSM3273572 DRX013094
GSM3273573 DRX014765
Relations
BioProject PRJNA481740

Download family                     Format
SOFT formatted family file(s)       SOFT
MINiML formatted family file(s)     MINiML
Series Matrix File(s)               TXT

Supplementary file                        Size      Download        File type/resource
GSE117217_RAW.tar                         481.1 Gb  (http)(custom)  TAR (of BW, TXT)
GSE117217_dmel_r6-11.intergenic.gtf.gz    152.4 Kb  (ftp)(http)     GTF
GSE117217_gene_counts.tsv.gz              230.2 Mb  (ftp)(http)     TSV
GSE117217_intergenic_counts.tsv.gz        41.7 Mb   (ftp)(http)     TSV
GSE117217_supplemental_metadata.tsv.gz    382.4 Kb  (ftp)(http)     TSV
SRA Run Selector
Raw data are available in SRA
Processed data provided as supplementary file
Processed data are available on Series record
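
The supplementary tables are gzip-compressed TSV files, and the counts file is large (230.2 Mb compressed), so it may help to inspect a file's layout before loading it whole. The sketch below reads only the header and the first few rows using the Python standard library. The function name `peek_tsv_gz` is hypothetical, and the sketch assumes the first line is a header row; the record does not describe the column layout.

```python
import csv
import gzip
from itertools import islice


def peek_tsv_gz(path, n_rows=5):
    """Return (header, first n_rows data rows) from a gzip-compressed TSV.

    Assumes the first line of the file is a header row.
    """
    with gzip.open(path, "rt", newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        header = next(reader, [])          # empty list if the file is empty
        rows = list(islice(reader, n_rows))  # stops early on short files
    return header, rows
```

After downloading a file from this record, for example `GSE117217_supplemental_metadata.tsv.gz`, calling `peek_tsv_gz` on its local path shows the column names without decompressing the whole file.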
