GEO Accession viewer

Sample GSM1407488

Status

Public on Oct 10, 2014

Title

2010111_TimePoint7_Blood

Sample type

SRA

Source name

Whole Blood

Organism

Macaca mulatta

Characteristics

time point: Time Point 7
gender: Male
mahpic non human primate individual id: RTi13

Treatment protocol

During the 100-day experiment, pyrimethamine was delivered (1 mg/kg) intramuscularly once on day 20, and for three successive days starting at days 52 and 90 (corresponding to time points 2, 4, and 6). These dates correspond to the predicted periods for sub-curative and curative experimental treatment regimens for malaria infection of macaques.

Growth protocol

Animals approved for use were moved into experimental housing 10 days prior to the start of the experiment.

Extracted molecule

total RNA

Extraction protocol

Bone marrow (BM; 1 ml) was collected into 1.5 ml tubes with EDTA, and the mononuclear cells were purified by density gradient centrifugation on Lymphoprep (Stem Cell Technologies) solution and preserved in RLT buffer (Qiagen) to stabilize mRNA. Whole blood (peripheral blood, PB; 3 ml) was collected in Tempus tubes (Applied Biosystems) that also preserve mRNA; these samples include erythrocytes, platelets and granulocytes in addition to mononuclear lymphocytes. RNA was extracted from the BM samples using Qiagen RNeasy Mini-Plus kits following the manufacturer-recommended procedures, and from PB samples using Tempus-Spin RNA isolation kits. The quality of all RNA samples was confirmed using a Bioanalyzer, with an RNA Integrity Number (RIN) greater than 8 recorded for all samples.
Approximately 1 μg of total RNA per sample was converted to double-stranded cDNA using poly-A beads to enrich for mRNA, and Illumina TruSeq Stranded mRNA Sample Prep kits to generate strand-specific libraries. As a quality control, 96 spike-in RNAs of known concentration and GC proportions (ERCC Spike-In Control, Life Technologies) were added to constitute approximately 1% of the total RNA for each library. Adapters were ligated to facilitate 3-plex sequencing on an Illumina HiSeq2000 at the Yerkes Genomics Core, aiming for 80 million paired-end 100 base pair (bp) reads per library.

Library strategy

RNA-Seq

Library source

transcriptomic

Library selection

cDNA

Instrument model

Illumina HiSeq 2000

Data processing

Bases were called with Illumina RTA (Real-Time Analysis, v1.13.48) with default parameters. FASTQC (v0.10.1) was used to assess data quality, but the data were not filtered at this stage.
To quantify gene expression, RNA-Seq reads were mapped to the listed assembly and annotation using TopHat2 (Trapnell et al., 2013). Default options were used, except that the option --library-type fr-secondstrand was set because reads were generated using a stranded library preparation method from Illumina. This allowed differentiation between sense and antisense transcripts. Only reads that mapped to a single location in the genome were included, to ensure high-confidence mapping.
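The mapping step above can be sketched in Python as follows. This is a minimal illustration, not the submitters' pipeline: the record does not say how single-location reads were selected, so the sketch assumes the aligner writes the NH tag defined in the SAM specification (number of reported alignments for the read) and keeps records with NH:i:1. The function names, file names, and index path are hypothetical; only the --library-type fr-secondstrand option comes from the record.

```python
def tophat2_command(index_base, reads_1, reads_2, out_dir):
    """Build a TopHat2 argument list for a stranded paired-end library.

    All paths are placeholders supplied by the caller (hypothetical).
    """
    return [
        "tophat2",
        "--library-type", "fr-secondstrand",  # option named in the record
        "-o", out_dir,                        # output directory
        index_base, reads_1, reads_2,
    ]


def is_uniquely_mapped(sam_line):
    """Return True if a SAM alignment record carries the tag NH:i:1.

    Assumption: the aligner emits the NH tag for every alignment.
    """
    fields = sam_line.rstrip("\n").split("\t")
    # Optional tags start at the 12th column of a SAM record.
    return "NH:i:1" in fields[11:]


def keep_unique(sam_lines):
    """Yield header lines unchanged and only uniquely mapped alignments."""
    for line in sam_lines:
        if line.startswith("@") or is_uniquely_mapped(line):
            yield line


# Toy SAM records (made up for illustration only).
toy_sam = [
    "@HD\tVN:1.0\tSO:coordinate",
    "read1\t0\tchr1\t100\t50\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII\tNH:i:1",
    "read2\t0\tchr1\t200\t3\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII\tNH:i:4",
]

for record in keep_unique(toy_sam):
    print(record.split("\t")[0])
```

Building the command as a list rather than a single string makes it suitable for passing to subprocess.run without shell quoting concerns.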
Several quality control steps were used to verify the reliability of the data: linear correlation of the estimated abundance of ERCC spike-in controls with their known concentrations; confirmation of 99.9% strand-specificity of the controls; less than 0.1% control fusion transcripts; and absence of 3’ bias in the controls, confirmed with RSeQC software (https://code.google.com/p/rseqc). Transcript abundance levels were inferred using HTSeq v0.5.4 (http://www-huber.embl.de/users/anders/HTSeq/doc/). HTSeq takes the short-read mapping file (BAM) from TopHat2 and the gene annotation file, which contains the locations of all annotated genes. Since some libraries were sequenced more deeply than others, the libraries were normalized before determining differential gene expression, using the gene-level expression files with the default parameters of DESeq version 1.10.1 (http://www.bioconductor.org/packages/release/bioc/html/DESeq.html).
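To illustrate why normalization is needed when libraries differ in depth, the sketch below computes median-of-ratios size factors, the approach generally described for DESeq's default library-size normalization. It is a pure-Python approximation under that assumption, not the DESeq code itself, and the counts and sample names are made up. The log2 transformation mentioned for the supplementary files is not shown because the record does not say how zero counts were handled.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors.

    counts: dict mapping sample name -> list of raw counts,
            with genes in the same order for every sample.
    Genes with a zero count in any sample are skipped, because their
    geometric mean across samples is zero.
    """
    samples = list(counts)
    n_genes = len(counts[samples[0]])

    # Log of the per-gene geometric mean across samples.
    log_geo_means = []
    for g in range(n_genes):
        values = [counts[s][g] for s in samples]
        if all(v > 0 for v in values):
            log_geo_means.append(sum(math.log(v) for v in values) / len(values))
        else:
            log_geo_means.append(None)

    factors = {}
    for s in samples:
        log_ratios = [
            math.log(counts[s][g]) - lgm
            for g, lgm in enumerate(log_geo_means)
            if lgm is not None
        ]
        factors[s] = math.exp(median(log_ratios))
    return factors


def normalize(counts, factors):
    """Divide each sample's counts by its size factor."""
    return {s: [v / factors[s] for v in counts[s]] for s in counts}


# Toy data: library_B was sequenced twice as deeply as library_A.
toy_counts = {
    "library_A": [10, 20, 30, 0],
    "library_B": [20, 40, 60, 5],
}
toy_factors = size_factors(toy_counts)
toy_normalized = normalize(toy_counts, toy_factors)
print(toy_factors)
```

After normalization, the first three genes have equal values in both toy libraries, which is the intended effect of correcting for sequencing depth.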
Genome_build: RNA-Seq reads were mapped to an early version of a new assembly (as of 5/2014) of the rhesus macaque genome (MacaM assembly, Version 4.0), created by Aleksey Zimin at the University of Maryland, Rob Norgren at the University of Nebraska Medical Center, and their colleagues. The MacaM assembly has been deposited in GenBank under accession PRJNA214746 (ID: 214746).
Supplementary_files_format_and_content: Excel files contain normalized transcript abundances, separately at the gene and exon levels, for each individual. Abundances are further classified by experimental Time Point (1-7), and Specimen Type (Whole Blood or Bone Marrow).
Supplementary_files_format_and_content: Gene Data. Column headers are defined as follows:
'gene_name': Identifiers of all genes in the reference annotation.
'gene_symbol': Symbols of all genes in the reference annotation.
'Sample ID / Raw File 1 / Raw File 2 / Normalized Read Counts': Sample ID, raw sequence file names for the sample, and library-size-normalized, log2-transformed read counts of all genes, from the Specimen Type, Individual ID, and Time Point listed directly below.
Notes: There will be one read count column per specimen per individual per Time Point, so 2 * 5 * 7 = 70 columns with read counts.
0 as value for read counts of some genes: Overlapping genes (sharing exons) were "collapsed" into a single gene for the purposes of RNA-Seq read assignment. The read count at such a locus therefore represents the cumulative expression of all the genes at that locus, but the entire read count is assigned to only one of the genes, and the others in the locus are assigned 0.
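The column arithmetic in the gene data notes (2 specimen types, 5 individuals, 7 time points) can be checked with a short Python sketch. The individual IDs and the column label format below are hypothetical placeholders; the actual headers in the Excel files may be laid out differently.

```python
from itertools import product

specimen_types = ["Whole Blood", "Bone Marrow"]
individuals = [f"individual_{i}" for i in range(1, 6)]  # placeholder IDs
time_points = range(1, 8)                               # Time Points 1-7

# One read count column per specimen per individual per time point.
read_count_columns = [
    f"{specimen} / {individual} / Time Point {tp}"
    for specimen, individual, tp in product(specimen_types, individuals, time_points)
]

print(len(read_count_columns))
```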
Supplementary_files_format_and_content: Exon Data. Column headers are defined as follows:
'Exon ID': IDs were generated during analysis. Each row contains the Exon Identifier of one exon, for which expression data were recorded. IDs are composed of two parts separated by a colon. The first part is the symbol of the gene to which this exon belongs. The second part is a numerical identifier for the order of this exon in the listed gene.
'Start Location of Exon': Each row contains the start location of one exon. Note that this location may not match the start location in the original annotation. See Notes section.
'End Location of Exon': Each row contains the end location of one exon. Note that this location may not match the end location in the original annotation. See Notes section.
'Strand of Exon': Each row contains strand information of one exon.
'Gene ID': Each row contains the Gene Identifier of the gene for that exon. Note that only one Gene ID is listed even when the exon is shared among multiple genes. See Notes section.
'Gene Symbol': Each row contains the Gene Symbol of the gene that the exon belongs to. Note that only one Gene Symbol is listed even when the exon is shared among multiple genes. See Notes section.
'Gene and Transcript Membership of Exon': Each row contains ALL the genes and transcripts that one exon belongs to (has membership in). This column provides the information needed to identify the exons as they are listed in the reference annotation.
'Sample ID / Raw File 1 / Raw File 2 / Normalized Read Counts': Sample ID, raw sequence file names for the sample, and library-size-normalized, log2-transformed read counts of all exons. The column header also contains information regarding Specimen Type, Animal ID, and Time Point. Each row contains the library-size-normalized, log2-transformed read count observed for one exon in the condition/sample.
Notes:
Number of read count columns: There will be one raw read count column per specimen per individual per Time Point, so 2 * 5 * 7 = 70 columns with read counts.
Number of exons listed: For the purpose of RNA-Seq read mapping, the exon boundaries and membership in genes were modified relative to the reference annotation. Essentially, it was ensured that every genomic location (nucleotide base) belongs to only one exon and each exon belongs to one and only one gene. Exons shared by multiple genes are listed only once, and their membership is assigned to only one gene (usually the largest gene spanning the exon). The true membership of such an exon, including all transcripts and genes that it is a part of, is documented in the column "Gene and Transcript Membership of Exon". In the case of overlapping exons, artificial boundaries are created at the point of overlap so that the nucleotides belong to only one exon. Hence, the number of exons listed, the boundaries of exons, and the exon IDs are slightly different from the original annotation.
Misc (exons of length 1, etc.): Due to the adjustment of boundaries described above, some exons are split, resulting in exons that are as small as one base long.
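The boundary adjustment described in the notes can be illustrated with a small Python sketch that cuts overlapping exons at every point of overlap so that each base belongs to exactly one segment. This is an assumed reconstruction of the idea, not the submitters' code: coordinates are taken to be 1-based and inclusive, the function names are hypothetical, and gene assignment of the resulting segments is not modeled. A helper for the two-part Exon ID format is included, with a made-up ID.

```python
def split_into_disjoint(exons):
    """Split possibly overlapping exons into disjoint segments.

    exons: list of (start, end) tuples, 1-based inclusive (assumption).
    Returns sorted (start, end) segments covering the same bases, with
    a boundary wherever any exon starts or ends.
    """
    # A new segment begins at every exon start and just after every exon end.
    points = sorted({s for s, _ in exons} | {e + 1 for _, e in exons})
    segments = []
    for left, right in zip(points, points[1:]):
        segment = (left, right - 1)
        # Keep the segment only if some original exon covers it.
        if any(s <= segment[0] and segment[1] <= e for s, e in exons):
            segments.append(segment)
    return segments


def parse_exon_id(exon_id):
    """Split an Exon ID of the form '<gene symbol>:<order>' at the last colon."""
    gene_symbol, _, order = exon_id.rpartition(":")
    return gene_symbol, int(order)


# Two toy exons that share a single base at position 200.
toy_segments = split_into_disjoint([(100, 200), (200, 300)])
print(toy_segments)
print(parse_exon_id("GENE1:3"))  # hypothetical ID
```

In the toy input, the shared base becomes its own segment, which shows how exons as short as one base can appear in the exon-level files.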

Submission date

Jun 10, 2014

Last update date

May 15, 2019

Contact name

Mary Galinski

Organization name

Emory University

Department

Vaccine Center at Yerkes

Lab

Galinski Lab

Street address

954 Gatewood Road

City

Atlanta

State/province

GA

ZIP/Postal code

30329

Country

USA

Platform ID

GPL14954

Series (2)

GSE58340	Malaria Host Pathogen Interaction Center Experiment 13: Gene and exon transcript abundances of uninfected Macaca mulatta treated with pyrimethamine over 7 time points in a 100 day study
GSE94274	An Integrated Approach to Understanding Host-Pathogen Interactions

Relations

BioSample

SAMN02848768

SRA

SRX584613

Supplementary data files not provided

Processed data are available on Series record

Raw data are available in SRA