Status |
Public on Jun 21, 2023 |
Title |
VarsSeqRNA |
Sample type |
SRA |
|
|
Source name |
Library integrated in PLH001
|
Organism |
Saccharomyces cerevisiae |
Characteristics |
strain: Library integrated in PLH001
|
Growth protocol |
Intron variant libraries were integrated into the genome of the S. cerevisiae strain PLH001 by homologous recombination, with CRISPR/Cas9-induced double-strand breaks to increase efficiency. Our library included seven sub-libraries with variants of the introns in the following genes: QCR9, RPL28, RPS9B, RPL7A, RPS9A, RPS14B, and RPL36B. Integrated constructs included intron variants in their complete gene context, with randomized 12N barcodes upstream of the introns in the 5' UTR regions. For both DNA and RNA sequencing, overnight cultures were grown from sub-library glycerol stocks at 30 °C in YPD liquid medium, diluted to OD600 0.1, and grown to OD600 0.5-0.6.
|
Extracted molecule |
total RNA |
Extraction protocol |
RNA was extracted using the YeaStar RNA Kit (Zymo Research), with 5 μL of Zymolyase per 5 mL of starting cell culture. We first depleted rRNA from the extracted RNA using RNase H with complementary oligos. We then used an RNA Clean and Concentrator-5 (Zymo) column with the size-selection protocol to exclude RNA below a size cutoff of 200 nucleotides. RNA was fragmented with Ambion fragmentation reagents, end-repaired with rSAP (NEB), and then ligated to an adenylated universal DNA cloning linker with T4 RNA ligase 2 truncated KQ (NEB). Excess DNA linker was degraded with 5’ Deadenylase (NEB) and RecJf (NEB). We reverse transcribed RNA with TGIRT enzyme (InGex), using a reverse transcription primer that included a 10N interval for unique molecular identifiers (UMIs). cDNA was amplified using targeted PCR primers for each sub-library, first adding i5/i7 indices along with Illumina adapters in a short indexing PCR. We then carried out an additional PCR reaction with P5/P7 primers to generate the final library and size-selected the library using RNAClean XP beads (Beckman Coulter).
|
|
|
Library strategy |
OTHER |
Library source |
transcriptomic |
Library selection |
other |
Instrument model |
Illumina NovaSeq 6000 |
|
|
Data processing |
UMI-tools was used to extract randomized barcode sequences (one for each transformant) and UMIs (one for each cDNA molecule) from sequencing reads. cutadapt was used to remove adapter sequences, including indexing primer sequences and the universal cloning linker, and to trim bases with a Q-score lower than 20.
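The tag-extraction idea can be illustrated with a minimal pure-Python sketch. It does not use UMI-tools and does not reproduce the submitters' settings. The read layout is an assumption made only for illustration: the 12-nt barcode is located by a constant upstream flank in one read, and the 10-nt UMI is taken from the start of the other read. The flank sequence `UPSTREAM_FLANK` and the function name `extract_tags` are hypothetical.

```python
import re

# ASSUMPTION: placeholder flank; the real constant sequence is not given in the record.
UPSTREAM_FLANK = "GGATCC"
BARCODE_LEN = 12  # 12N barcode (from the record)
UMI_LEN = 10      # 10N UMI (from the record)

BARCODE_RE = re.compile(
    rf"{UPSTREAM_FLANK}(?P<barcode>[ACGT]{{{BARCODE_LEN}}})"
)


def extract_tags(barcode_read, umi_read):
    """Return (barcode, umi), or None if either tag cannot be read cleanly.

    ASSUMPTION: the barcode follows UPSTREAM_FLANK in `barcode_read` and the
    UMI occupies the first UMI_LEN bases of `umi_read`.
    """
    match = BARCODE_RE.search(barcode_read)
    if match is None or len(umi_read) < UMI_LEN:
        return None
    umi = umi_read[:UMI_LEN]
    if set(umi) - set("ACGT"):  # reject UMIs containing N or other symbols
        return None
    return match.group("barcode"), umi
```

Reads for which either tag is missing or ambiguous return `None`, so they can be discarded before alignment.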
Sequencing reads were aligned to spliced isoform reference sequences for each intron. For reference sequence annotations, we collated all possible isoforms capturing spliced and unspliced transcripts for each gene (4 isoforms for the two-intron gene RPL7A and 2 isoforms for all other genes), and we generated GFF annotation files with gmap. TopHat2 was used to align sequencing reads to these reference isoform annotations, using the following flags: --no-novel-juncs -T --bp-mp 3,1 --b2-rdg 5,1 --b2-rfg 5,1 --segment-length 20 --segment-mismatches 3 --read-gap-length 7 --read-edit-dist 50 -m 1 --max-insertion-length 19 --max-deletion-length 19. We generated FASTA files from the resulting alignment files with samtools.
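The isoform counts quoted above (2 for a one-intron gene, 4 for a two-intron gene) follow from treating each intron as independently retained or spliced, giving 2^n combinations for n introns. The sketch below enumerates them for a toy sequence. The function name `isoform_sequences` and the coordinate convention (0-based, half-open intervals) are assumptions for illustration, not the submitters' code.

```python
from itertools import product


def isoform_sequences(pre_mrna, introns):
    """Enumerate every retained/spliced combination of the given introns.

    `introns` is a sorted list of non-overlapping (start, end) intervals,
    0-based and half-open (an illustrative convention). Returns a dict that
    maps a tuple of booleans (True = intron retained) to the isoform sequence.
    """
    isoforms = {}
    for retained_flags in product([True, False], repeat=len(introns)):
        parts = []
        position = 0
        for (start, end), retained in zip(introns, retained_flags):
            parts.append(pre_mrna[position:start])   # exon before this intron
            if retained:
                parts.append(pre_mrna[start:end])    # keep the intron
            position = end
        parts.append(pre_mrna[position:])            # final exon
        isoforms[retained_flags] = "".join(parts)
    return isoforms


# Toy two-intron transcript: exons in upper case, introns in lower case.
toy = "AAAgtccagCCCgtttagGGG"
toy_isoforms = isoform_sequences(toy, [(3, 9), (12, 18)])
```

Here `toy_isoforms` holds four sequences, from the fully unspliced transcript to the fully spliced one; a single-intron gene yields two.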
Each aligned sequence was classified into one of three categories: unspliced, spliced at the expected junction, or other. We required an exact match to the 14 nucleotides representing the unspliced or spliced junction. For each barcode, we computed the number of spliced and unspliced reads after deduplicating UMIs. For each barcode, we then computed the retained intron fraction (the ratio of unspliced to total read counts) and the normalized mRNA level (the ratio of the spliced read count to the transformation frequency from DNA sequencing).
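A minimal sketch of the classification and per-barcode metrics is given below, under stated assumptions. The record does not say how the 14-nt junction sequences are positioned, how a UMI observed in more than one category is handled, or whether "total" includes reads in the "other" category. Here the junction 14-mers are passed in by the caller, each distinct (barcode, UMI, category) combination is counted once, and total is taken as spliced plus unspliced. The function names are hypothetical.

```python
from collections import defaultdict


def classify_read(seq, spliced_junction, unspliced_junction):
    """Classify a read by exact match to a 14-nt junction sequence."""
    if spliced_junction in seq:
        return "spliced"
    if unspliced_junction in seq:
        return "unspliced"
    return "other"


def barcode_metrics(records, dna_frequency):
    """Compute retained intron fraction and normalized mRNA level per barcode.

    `records` is an iterable of (barcode, umi, category) tuples.
    `dna_frequency` maps barcode -> transformation frequency from DNA-seq.
    ASSUMPTIONS: duplicates are collapsed per (barcode, umi, category), and
    total = spliced + unspliced (reads classified as "other" are excluded).
    """
    counts = defaultdict(lambda: {"spliced": 0, "unspliced": 0, "other": 0})
    for barcode, _umi, category in set(records):  # set() deduplicates UMIs
        counts[barcode][category] += 1

    metrics = {}
    for barcode, c in counts.items():
        total = c["spliced"] + c["unspliced"]
        frequency = dna_frequency.get(barcode)
        metrics[barcode] = {
            "retained_intron_fraction": c["unspliced"] / total if total else None,
            "normalized_mrna_level": c["spliced"] / frequency if frequency else None,
        }
    return metrics
```

Barcodes with no spliced or unspliced reads, or with no DNA-seq frequency, get `None` rather than a division error, which keeps them easy to filter out downstream.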
Assembly: S. cerevisiae intron reference sequences were obtained from Talkish et al., 2019 (doi: 10.1371/journal.pgen.1008249). Coding ORF annotations were obtained from the Saccharomyces Genome Database for the S288C reference genome. Pre-mRNA reference sequences were obtained using the sacCer3 UCSC genome assembly.
Supplementary files format and content: For each sub-library, in files ending with barcode_umi_spliced.txt we provide read counts for spliced and unspliced RNA for all barcodes and UMIs. Each row has the following format: 12-nucleotide barcode, 10-nucleotide UMI, number of spliced reads, number of unspliced reads, and number of other reads. In addition, we provide the file designed_var_barcode_spliced.csv, which includes spliced and unspliced read counts per barcode for designed variants after demultiplexing UMIs. Samples are indexed as follows: S1 (QCR9), S2 (RPL28), S3 (RPS9B), S4 (RPL7A), S5 (RPS9A), S7 (RPS14B), and S8 (RPL36B).
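The row format described above can be read with a short parser such as the sketch below. The field order (barcode, UMI, spliced, unspliced, other) comes from the description. The delimiter is not stated, so whitespace-separated fields are an assumption, and the names `Row`, `parse_rows`, and `reads_per_barcode` are hypothetical.

```python
from collections import defaultdict
from typing import NamedTuple


class Row(NamedTuple):
    barcode: str    # 12-nucleotide barcode
    umi: str        # 10-nucleotide UMI
    spliced: int    # number of spliced reads
    unspliced: int  # number of unspliced reads
    other: int      # number of other reads


def parse_rows(lines):
    """Yield Row objects from text lines.

    ASSUMPTION: fields are whitespace-separated. Lines that do not have five
    fields with three trailing integers (blank lines, headers) are skipped.
    """
    for line in lines:
        fields = line.split()
        if len(fields) != 5:
            continue
        barcode, umi, *counts = fields
        if not all(c.isdigit() for c in counts):
            continue
        yield Row(barcode, umi, *(int(c) for c in counts))


def reads_per_barcode(rows):
    """Sum (spliced, unspliced, other) read counts over all UMIs of a barcode."""
    totals = defaultdict(lambda: [0, 0, 0])
    for row in rows:
        totals[row.barcode][0] += row.spliced
        totals[row.barcode][1] += row.unspliced
        totals[row.barcode][2] += row.other
    return {barcode: tuple(t) for barcode, t in totals.items()}
```

With a real file, the same functions can be applied to an open file handle, for example `reads_per_barcode(parse_rows(open(path)))`, where `path` points to one of the extracted barcode_umi_spliced.txt files.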
Library strategy: Targeted RNA-seq
|
|
|
Submission date |
Jun 20, 2023 |
Last update date |
Jun 21, 2023 |
Contact name |
Rhiju Das |
E-mail(s) |
rhiju@stanford.edu
|
Organization name |
Stanford University
|
Street address |
279 Campus Dr, B419 Beckman Center
|
City |
PALO ALTO |
State/province |
CA |
ZIP/Postal code |
94305 |
Country |
USA |
|
|
Platform ID |
GPL27812 |
Series (1) |
GSE209857 |
RNA structure landscape of S. cerevisiae introns |
|
Relations |
BioSample |
SAMN35810626 |
SRA |
SRX20731456 |
Supplementary file | Size | Download | File type/resource |
GSM7501618_barcode_umi_spliced.tar.gz | 11.8 Mb | (ftp)(http) | TAR |
Raw data are available in SRA |
Processed data provided as supplementary file |