Sample GSM2047377
Status Public on Jul 08, 2016
Title NA19098-r1-E07
Sample type SRA
 
Source name LCL-derived iPSC
Organism Homo sapiens
Characteristics individual: NA19098
replicate: r1
well: E07
cell type: LCL-derived iPSC
Biomaterial provider Coriell http://ccr.coriell.org/Sections/Search/Search.aspx?PgId=165&q=NA19098
Growth protocol Undifferentiated feeder-free iPSCs generated from Yoruba LCLs were grown in E8 medium (Life Tech) (G. Chen et al. 2011) on Matrigel-coated tissue culture plates with daily media feeding at 37 °C with 5% (vol/vol) CO2. For standard maintenance, cells were split every 3-4 days at roughly 80% confluence using cell release solution (0.5 mM EDTA and NaCl in PBS). For the single-cell suspension, iPSCs were individualized with Accutase Cell Detachment Solution (BD) for 5-7 minutes at 37 °C and washed twice with E8 medium immediately before each experiment. Cell viability and cell counts were then measured with the Automated Cell Counter (Bio-Rad) to generate resuspension densities of 2.5 × 10^5 cells/mL in E8 medium for C1 cell capture.
Extracted molecule polyA RNA
Extraction protocol Single-cell loading and capture were performed following the Fluidigm manual "Using C1 to Generate Single-Cell cDNA Libraries for mRNA Sequencing Protocol" (PN 100-7168). Briefly, 30 µl of C1 Suspension Reagent was added to a 70-µl aliquot of ~17,500 cells. Five µl of this cell mix were loaded onto a 10-17 µm C1 Single-Cell Auto Prep IFC microfluidic chip (Fluidigm), and the chip was then processed on a C1 instrument using the cell-loading script according to the manufacturer's instructions. Using the standard staining script, the iPSCs were stained with StainAlive TRA-1-60 Antibody (Stemgent, PN 09-0068). The capture efficiency and TRA-1-60 staining were then inspected using the EVOS FL Cell Imaging System (ThermoFisher) (supplemental Table X). Immediately after imaging, reverse transcription and cDNA amplification were performed in the C1 system using the SMARTer PCR cDNA Synthesis kit (Clontech) and the Advantage 2 PCR kit (Clontech) according to the instructions in the Fluidigm user manual, with minor changes to incorporate UMI labeling (Islam et al. 2014). Specifically, the reverse transcription primer and the 1:50,000 Ambion® ERCC Spike-In Mix1 (Life Tech) were added to the lysis buffer, and the template-switching oligos, which contain the UMI (5-bp random sequence), were included in the reverse transcription mix. When the run finished, full-length, amplified, single-cell cDNA libraries were harvested in a total of approximately 13 µl of C1 Harvesting Reagent and quantified using the DNA High Sensitivity LabChip (Caliper). A bulk sample, a 40 µl aliquot of ~10,000 cells, was collected in parallel with each C1 chip using the same reaction mixes, following the C1 protocol "Tube Controls with Purified RNA" (PN 100-7168, Appendix A).
For sequencing library preparation, fragmentation and isolation of 5' fragments were performed according to the UMI protocol (Islam et al. 2014). Instead of using commercially available Tn5 transposase, Tn5 protein stock was freshly purified in house using the IMPACT system (pTXB1, NEB) following the protocol previously described (Picelli et al. 2014). The activity of Tn5 was tested and shown to be comparable with that of the EZ-Tn5-Transposase (Epicentre). Importantly, all the libraries in this study were generated using the same batch of purified Tn5 protein. For each of the bulk samples, two libraries were generated using two different indices in order to get sufficient material. All 18 bulk libraries were then pooled and labelled as the "bulk" for sequencing.
 
Library strategy RNA-Seq
Library source transcriptomic
Library selection cDNA
Instrument model Illumina HiSeq 2500
 
Description Processed data files:
reads-raw-single-per-lane.txt
reads-raw-single-per-sample.txt
molecules-raw-single-per-lane.txt
molecules-raw-single-per-sample.txt
Data processing We used umitools v2.1.1 (https://github.com/brwnj/umitools/releases/tag/v2.1.1) to remove the 5 bp UMI at the 5' end of the reads.
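The UMI-removal step can be illustrated with a minimal Python sketch. It is not the umitools implementation: the function name strip_umi and the ":UMI_" read-name tag are hypothetical and chosen only for illustration. The only value taken from this record is the 5 bp UMI length at the 5' end of each read.

```python
def strip_umi(name, seq, qual, umi_len=5):
    """Remove the UMI from the 5' end of one read.

    name, seq, qual are the FASTQ read name, bases, and quality string.
    The ":UMI_" tag appended to the name is a hypothetical convention
    used here so the UMI is not lost after trimming.
    """
    umi = seq[:umi_len]
    # Trim the same number of characters from bases and qualities
    # so the two strings stay aligned.
    return f"{name}:UMI_{umi}", seq[umi_len:], qual[umi_len:]


# Example read: first 5 bases are the UMI, the rest is cDNA sequence.
example = strip_umi("read1", "ACGTATTTGGG", "IIIIIJJJJJJ")
```

Keeping the UMI in the read name is one common design choice, since it lets the UMI travel with the read through mapping so it can be used later for molecule counting.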
We used sickle version 1.33 (https://github.com/najoshi/sickle/releases/tag/v1.33) to perform quality trimming of the 3' end of the reads (flag -x). We used the default quality thresholds.
We used Subjunc version 1.5.0-p1 (http://sourceforge.net/projects/subread/files/subread-1.5.0-p1/) to map reads to both human genome hg19 (chromosomes 1-22, X, Y, M) and the ERCC spike-ins (http://tools.invitrogen.com/downloads/ERCC92.fa). We used the default thresholds and only kept uniquely mapping reads (flag -u).
We used featureCounts version 1.5.0-p1 (http://sourceforge.net/projects/subread/files/subread-1.5.0-p1/) to count the number of reads for all protein-coding genes (Ensembl GRCh37 release 82) and the ERCC spike-in genes. We performed strand-specific counting (flag -s 1) because the UMI protocol preserves sequence strand information.
In addition to read counts, we utilized the UMI information to obtain molecule counts for the single cell samples only. We did not count molecules for the bulk samples because they violate the assumptions of the UMI protocol, i.e., they contain too many unique molecules for the 1,024 UMIs to properly tag them all. First, we combined all reads for a given single cell using samtools version 0.1.18-dev (r982:313) (http://sourceforge.net/projects/samtools/files/samtools/0.1.18/). Next, we converted read counts to molecule counts using UMI-tools (https://github.com/CGATOxford/UMI-tools). UMI-tools counts the number of UMIs at each read start position. Furthermore, it accounts for sequencing errors in the UMIs introduced during the PCR amplification or sequencing steps using a "directional adjacency" method. Briefly, all UMIs at a given read start position are connected in a network using an edit distance of one base pair. However, edges between nodes (the UMIs) are only formed if the nodes have less than a 2x difference in reads. The node with the highest number of reads is counted as a unique molecule, and then it and all connected nodes are removed from the network. This is repeated until all nodes have been counted or removed.
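The basic idea of converting reads to molecules can be sketched in Python as follows. This is a simplified illustration, not UMI-tools code: it counts distinct UMIs at each read start position and deliberately omits the "directional adjacency" error correction, for which the UMI-tools documentation is the reference. The function name, the (gene, start, UMI) input layout, and the gene names are hypothetical. The 1,024 figure follows from the 5-bp random UMI (4^5 possible sequences).

```python
from collections import defaultdict

# A 5-bp random UMI over the alphabet A, C, G, T gives 4**5 = 1,024 tags.
N_POSSIBLE_UMIS = 4 ** 5


def count_molecules(reads):
    """Count distinct UMIs per read start position, summed per gene.

    reads: iterable of (gene, start_position, umi) tuples for one cell.
    No UMI error correction is applied in this sketch.
    """
    umis_at_position = defaultdict(set)
    for gene, start, umi in reads:
        umis_at_position[(gene, start)].add(umi)

    molecules = defaultdict(int)
    for (gene, _start), umis in umis_at_position.items():
        molecules[gene] += len(umis)
    return dict(molecules)


# Hypothetical reads: duplicates with the same start and UMI collapse to one.
example_reads = [
    ("geneA", 100, "AAAAA"),
    ("geneA", 100, "AAAAA"),  # PCR duplicate of the read above
    ("geneA", 100, "CCCCC"),
    ("geneA", 250, "AAAAA"),  # same UMI, different start: separate molecule
    ("geneB", 40, "GGGGG"),
]
molecule_counts = count_molecules(example_reads)
```

With only 1,024 possible tags, a sample containing many more molecules than that per start position cannot be tagged uniquely, which is why the bulk samples are excluded from molecule counting.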
Genome_build: hg19
Supplementary_files_format_and_content: The processed data files are tab-delimited text files that contain the raw counts for all protein-coding genes (Ensembl GRCh37 release 82) and the ERCC spike-in genes. Each row corresponds to a sample. Each column contains either sample metadata or counts for a given gene. The filename denotes the type of counts it contains. "reads" are the number of sequences per gene, as per traditional RNA-seq, whereas "molecules" are the number of UMIs per gene. "bulk" is sequencing of a population of cells, as per traditional RNA-seq, whereas "single" is sequencing of single cells. "lane" is the gene counts for a given sample from one lane of sequencing, whereas "sample" is the sum of the gene counts from all lanes for a given sample.
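The filename convention and the relationship between the per-lane and per-sample files can be sketched in Python. This is an illustration under stated assumptions: the column names "individual", "replicate", and "well" mirror the characteristics listed on this record but are not confirmed as the actual file headers, and the gene columns "geneA" and "geneB" and both function names are hypothetical.

```python
from collections import defaultdict


def parse_filename(filename):
    """Split a name such as 'reads-raw-single-per-lane.txt' into its parts."""
    stem = filename.rsplit(".", 1)[0]
    counts, _raw, cells, _per, level = stem.split("-")
    return {"counts": counts, "cells": cells, "level": level}


def sum_lanes(rows, gene_columns):
    """Sum per-lane gene counts into per-sample counts.

    rows: list of dicts, one per (sample, lane); the metadata keys used
    to identify a sample are assumed column names, not confirmed ones.
    """
    totals = defaultdict(lambda: defaultdict(int))
    for row in rows:
        sample = (row["individual"], row["replicate"], row["well"])
        for gene in gene_columns:
            totals[sample][gene] += int(row[gene])
    return {sample: dict(counts) for sample, counts in totals.items()}


# Hypothetical per-lane rows for one single cell sequenced on two lanes.
lane_rows = [
    {"individual": "NA19098", "replicate": "r1", "well": "E07", "geneA": "3", "geneB": "0"},
    {"individual": "NA19098", "replicate": "r1", "well": "E07", "geneA": "2", "geneB": "1"},
]
per_sample = sum_lanes(lane_rows, ["geneA", "geneB"])
```

In practice the files could be read with the standard csv module using a tab delimiter before being passed to a function like sum_lanes.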
 
Submission date Jan 27, 2016
Last update date May 15, 2019
Contact name John D Blischak
Organization name University of Chicago
Department Human Genetics
Lab Gilad
Street address 920 E. 58th Street, CLSC 317
City Chicago
State/province IL
ZIP/Postal code 60615
Country USA
 
Platform ID GPL16791
Series (1)
GSE77288 Batch effects and the effective design of single-cell gene expression studies
Relations
BioSample SAMN04442868
SRA SRX1548491

Supplementary data files not provided
Raw data are available in SRA
Processed data are available on Series record
