Sample GSM7501618
Status Public on Jun 21, 2023
Title VarsSeqRNA
Sample type SRA
 
Source name Library integrated in PLH001
Organism Saccharomyces cerevisiae
Characteristics strain: Library integrated in PLH001
Growth protocol Intron variant libraries were integrated into the genome of the S. cerevisiae strain PLH001 using homologous recombination, with CRISPR/Cas9-induced double-strand breaks to increase efficiency. Our library included seven sub-libraries with variants of the introns in the following genes: QCR9, RPL28, RPS9B, RPL7A, RPS9A, RPS14B, and RPL36B. Integrated constructs included intron variants in their complete gene context, with randomized 12N barcodes upstream of the introns in the 5' UTR regions. For both DNA and RNA sequencing, overnight cultures were grown from sub-library glycerol stocks at 30 °C in YPD liquid medium, diluted to OD600 0.1, and grown to OD600 0.5-0.6.
Extracted molecule total RNA
Extraction protocol RNA was extracted with the YeaStar RNA Kit (Zymo Research), using 5 μL of Zymolyase for 5 mL of starting cell culture.
We first depleted rRNA from the extracted RNA using RNase H with complementary oligos. We then used an RNA Clean and Concentrator-5 (Zymo) column with the size-selection protocol to exclude RNA below a size cutoff of 200 nucleotides. RNA was fragmented with Ambion fragmentation reagents, end-repaired with rSAP (NEB), and then ligated to an adenylated universal DNA cloning linker with T4 RNA ligase 2 truncated KQ (NEB). Excess DNA linker was degraded with 5’ Deadenylase (NEB) and RecJf (NEB). We reverse transcribed RNA with TGIRT enzyme (InGex), using a reverse transcription primer that included a 10N interval for unique molecular identifiers (UMIs). cDNA was amplified using targeted PCR primers for each sub-library, first adding i5/i7 indices along with Illumina adapters in a short indexing PCR. We then carried out an additional PCR reaction with P5/P7 primers to generate the final library and size-selected the library using RNAClean XP beads (Beckman Coulter).
 
Library strategy OTHER
Library source transcriptomic
Library selection other
Instrument model Illumina NovaSeq 6000
 
Data processing UMI-tools was used to extract randomized barcode sequences (one for each transformant) and UMIs (one for each cDNA molecule) from sequencing reads. cutadapt was used to remove adapter sequences, including indexing primer sequences and the universal cloning linker, and to trim bases with a Q-score lower than 20.
Sequencing reads were aligned to spliced isoform reference sequences for each intron. For reference sequence annotations, we collated all possible isoforms capturing spliced and unspliced transcripts for each gene (4 isoforms for two-intron RPL7A and 2 isoforms for all other genes), and we generated gff annotation files with gmap. TopHat2 was used to align sequencing reads to these reference isoform annotations, using the following flags: --no-novel-juncs -T --bp-mp 3,1 --b2-rdg 5,1 --b2-rfg 5,1 --segment-length 20 --segment-mismatches 3 --read-gap-length 7 --read-edit-dist 50 -m 1 --max-insertion-length 19 --max-deletion-length 19. We generated fasta files from the resulting alignment files with samtools.
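The isoform counts above (2 for a single-intron gene, 4 for two-intron RPL7A) follow from retaining or removing each intron independently. A minimal sketch of that enumeration is given here; the function name `enumerate_isoforms`, the "S"/"U" labels, and the toy sequences are hypothetical and are not taken from the submitters' pipeline.

```python
from itertools import product


def enumerate_isoforms(exons, introns):
    """Build every isoform by independently retaining or removing each intron.

    exons   -- list of n+1 exon sequences, in transcript order
    introns -- list of n intron sequences, in transcript order
    Returns a dict mapping a label such as "SU" (first intron spliced,
    second intron unspliced) to the isoform sequence.
    """
    if len(exons) != len(introns) + 1:
        raise ValueError("expected exactly one more exon than introns")
    isoforms = {}
    # One True/False choice per intron: True means the intron is retained.
    for retained in product((False, True), repeat=len(introns)):
        seq = exons[0]
        for keep, intron, exon in zip(retained, introns, exons[1:]):
            seq += (intron if keep else "") + exon
        label = "".join("U" if keep else "S" for keep in retained)
        isoforms[label] = seq
    return isoforms


# Toy sequences only (hypothetical, not real yeast sequence).
one_intron = enumerate_isoforms(["AAA", "CCC"], ["gt"])
two_intron = enumerate_isoforms(["A", "C", "T"], ["gg", "tt"])
```

With n introns this yields 2^n reference sequences, which matches the counts stated in the record.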
Each aligned sequence was classified into one of three categories: unspliced, spliced at the expected junction, or other. We required an exact match to a 14-nucleotide sequence representing the unspliced junction or the spliced junction. For each barcode, we computed the number of spliced and unspliced reads after deduplicating UMIs. For each barcode, we then computed the retained intron fraction (ratio between unspliced and total read counts) along with the normalized mRNA level (ratio between the spliced read count and the transformation frequency from DNA sequencing).
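The classification and per-barcode ratios described above can be sketched as follows. This is an illustration under stated assumptions, not the submitters' code: the 14-nucleotide junction strings are passed in by the caller, "total" is taken to be spliced plus unspliced counts, and the names `classify_read`, `summarize_barcodes`, and `dna_freq` are hypothetical.

```python
from collections import defaultdict


def classify_read(read_seq, unspliced_junction, spliced_junction):
    """Label a read by exact match to a 14-nt junction sequence."""
    if spliced_junction in read_seq:
        return "spliced"
    if unspliced_junction in read_seq:
        return "unspliced"
    return "other"


def summarize_barcodes(records, dna_freq):
    """Deduplicate UMIs per barcode, then compute the two ratios.

    records  -- iterable of (barcode, umi, label) tuples
    dna_freq -- dict of barcode -> transformation frequency from DNA-seq
    Assumption: total = spliced + unspliced (reads labeled "other" ignored).
    """
    umis = defaultdict(lambda: {"spliced": set(), "unspliced": set()})
    for barcode, umi, label in records:
        if label in ("spliced", "unspliced"):
            umis[barcode][label].add(umi)  # a set keeps each UMI once

    summary = {}
    for barcode, by_label in umis.items():
        spliced = len(by_label["spliced"])
        unspliced = len(by_label["unspliced"])
        total = spliced + unspliced
        freq = dna_freq.get(barcode)
        summary[barcode] = {
            "spliced": spliced,
            "unspliced": unspliced,
            "retained_intron_fraction": unspliced / total if total else None,
            "normalized_mrna": spliced / freq if freq else None,
        }
    return summary


# Toy input (hypothetical barcodes, UMIs, and frequency).
records = [
    ("BC1", "U1", "spliced"),
    ("BC1", "U1", "spliced"),    # duplicate UMI, counted once
    ("BC1", "U2", "unspliced"),
    ("BC1", "U3", "other"),      # ignored in both ratios
]
result = summarize_barcodes(records, {"BC1": 0.5})
```

Barcodes with no DNA-seq frequency get `None` for the normalized mRNA level rather than a division error.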
Assembly: S. cerevisiae intron reference sequences were obtained from Talkish et al. 2019 (doi: 10.1371/journal.pgen.1008249). Coding ORF annotations were obtained from the Saccharomyces Genome Database for the S288C reference genome. Pre-mRNA reference sequences were obtained using the sacCer3 UCSC genome assembly.
Supplementary files format and content: For each sub-library, in files ending with barcode_umi_spliced.txt we provide read counts for spliced and unspliced RNA for all barcodes and UMIs. Each row has the following format: 12-nucleotide barcode, 10-nucleotide UMI, number of spliced reads, number of unspliced reads, and number of other reads. In addition, we provide the file designed_var_barcode_spliced.csv, which includes spliced and unspliced read counts per barcode for designed variants after demultiplexing UMIs. Samples are indexed as follows: S1 (QCR9), S2 (RPL28), S3 (RPS9B), S4 (RPL7A), S5 (RPS9A), S7 (RPS14B), and S8 (RPL36B).
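A row in the described five-field format could be read along these lines. The record does not state the field delimiter, so whitespace separation is an assumption here, and the names `Row`, `parse_row`, and `SAMPLE_TO_GENE` are hypothetical; only the field order and the sample-to-gene mapping come from the description above.

```python
from typing import NamedTuple

# Sample index to sub-library gene, as listed in the record (there is no S6).
SAMPLE_TO_GENE = {
    "S1": "QCR9", "S2": "RPL28", "S3": "RPS9B", "S4": "RPL7A",
    "S5": "RPS9A", "S7": "RPS14B", "S8": "RPL36B",
}


class Row(NamedTuple):
    barcode: str    # 12-nucleotide transformant barcode
    umi: str        # 10-nucleotide unique molecular identifier
    spliced: int    # number of spliced reads
    unspliced: int  # number of unspliced reads
    other: int      # number of other reads


def parse_row(line):
    """Parse one row; assumes whitespace-separated fields (not confirmed)."""
    fields = line.split()
    if len(fields) != 5:
        raise ValueError(f"expected 5 fields, got {len(fields)}")
    barcode, umi = fields[0], fields[1]
    if len(barcode) != 12 or len(umi) != 10:
        raise ValueError("barcode must be 12 nt and UMI must be 10 nt")
    spliced, unspliced, other = (int(x) for x in fields[2:])
    return Row(barcode, umi, spliced, unspliced, other)


# Toy row (hypothetical values).
row = parse_row("ACGTACGTACGT ACGTACGTAC 3 1 0")
```

If the files turn out to be tab- or comma-delimited, only the `line.split()` call needs to change.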
Library strategy: Targeted RNA-seq
 
Submission date Jun 20, 2023
Last update date Jun 21, 2023
Contact name Rhiju Das
E-mail(s) rhiju@stanford.edu
Organization name Stanford University
Street address 279 Campus Dr, B419 Beckman Center
City PALO ALTO
State/province CA
ZIP/Postal code 94305
Country USA
 
Platform ID GPL27812
Series (1)
GSE209857 RNA structure landscape of S. cerevisiae introns
Relations
BioSample SAMN35810626
SRA SRX20731456

Supplementary file Size Download File type/resource
GSM7501618_barcode_umi_spliced.tar.gz 11.8 Mb (ftp)(http) TAR
Raw data are available in SRA
Processed data provided as supplementary file
