Sample GSM7501617
Status Public on Jun 21, 2023
Title VarsSeqDNA
Sample type SRA
 
Source name Library integrated in PLH001
Organism Saccharomyces cerevisiae
Characteristics strain: Library integrated in PLH001
Growth protocol Intron variant libraries were integrated into the genome of S. cerevisiae strain PLH001 by homologous recombination, with CRISPR/Cas9-induced double-strand breaks used to increase efficiency. The library comprised seven sub-libraries containing variants of the introns in the following genes: QCR9, RPL28, RPS9B, RPL7A, RPS9A, RPS14B, and RPL36B. Integrated constructs carried intron variants in their complete gene context, with randomized 12N barcodes upstream of the introns in the 5' UTR regions. For both DNA and RNA sequencing, overnight cultures were grown from sub-library glycerol stocks at 30 °C in YPD liquid medium, diluted to OD600 0.1, and grown to OD600 0.5-0.6.
Extracted molecule genomic DNA
Extraction protocol DNA was extracted using the YeaStar Genomic DNA Kit (Zymo Research).
We carried out a 7-cycle indexing PCR to amplify the target intron library regions (complete variant sequences and barcode sequences) and to add i5/i7 indices for demultiplexing sub-libraries in Illumina sequencing. We then amplified the samples for 11-14 PCR cycles with P5/P7 primers to obtain sufficient material for Illumina sequencing, and size-selected the final library using RNAClean XP beads (Beckman Coulter).
 
Library strategy OTHER
Library source genomic
Library selection other
Instrument model Illumina MiSeq
 
Data processing UMI-tools was used to extract randomized barcode sequences by identifying sequences directly downstream of primer binding sites, and paired-end reads were error-corrected using bbmerge.
Paired-end reads were aligned to intron reference sequences using bwa with default settings. Alignments were sorted using fgbio's SortBam and mate-pairs were annotated with fgbio's SetMateInformation.
Reads with the same barcode were grouped using fgbio's GroupReadsByUmi. Consensus reads were then called using fgbio's CallMolecularConsensusReads with default settings except for the flags -m 20 -M 5.
Consensus reads were quality filtered using fgbio's FilterConsensusReads with the flags -M 5 -q 20 -N 20 -E 0.01. With samtools, the resulting alignment file was sorted and paired-end consensus reads were retrieved. Paired-end reads were then merged with bbmerge using default settings. We kept only the consensus reads with highest coverage (primary consensus sequence) for each barcode, and we required that at most 5% of reads map to secondary consensus sequences. From the resulting consensus sequences, we obtained barcode mappings for designed variants for each of our 7 sub-libraries.
Assembly: S. cerevisiae intron reference sequences were obtained from Talkish et al. 2019 (doi: 10.1371/journal.pgen.1008249). Coding ORF annotations were obtained from the Saccharomyces Genome Database for the S288C reference genome. Pre-mRNA reference sequences were obtained using the sacCer3 UCSC genome assembly.
Supplementary files format and content: For each sub-library, barcodes and corresponding variants are included along with coverage values. In files ending with conseus_coverage.fa, we provide pairs of barcodes and consensus sequences along with coverage values (number of sequencing reads supporting each consensus sequence). In files ending with complete.fa, we list all designed intron variant sequences for each sub-library, and in files ending with designed_var_barcodes.txt, we provide all barcodes and coverage values corresponding to each designed variant sequence. Samples are indexed as follows: S1 (QCR9), S2 (RPL28), S3 (RPS9B), S4 (RPL7A), S5 (RPS9A), S7 (RPS14B), and S8 (RPL36B).
Library strategy: Targeted gDNA sequencing
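
The final filtering step of the data processing above (keep the highest-coverage consensus sequence per barcode, and require that at most 5% of reads map to secondary consensus sequences) can be sketched in Python as follows. This is only an illustrative sketch, not the submitters' code: the function name, the (barcode, sequence, coverage) record layout, and the example values are hypothetical, and it assumes the 5% threshold is computed per barcode as secondary coverage divided by total coverage for that barcode.

```python
from collections import defaultdict


def select_primary_consensus(records, max_secondary_fraction=0.05):
    """Pick one consensus sequence per barcode (hypothetical helper).

    records: iterable of (barcode, consensus_sequence, coverage) tuples.
    Returns {barcode: (primary_sequence, primary_coverage)} for barcodes
    whose secondary consensus sequences hold at most
    max_secondary_fraction of that barcode's total coverage.
    """
    # Sum coverage per (barcode, consensus sequence) pair.
    coverage_by_barcode = defaultdict(lambda: defaultdict(int))
    for barcode, sequence, coverage in records:
        coverage_by_barcode[barcode][sequence] += coverage

    primary = {}
    for barcode, seq_cov in coverage_by_barcode.items():
        total = sum(seq_cov.values())
        if total == 0:
            continue  # no supporting reads, nothing to call
        # Primary consensus = sequence with the highest coverage.
        best_seq, best_cov = max(seq_cov.items(), key=lambda kv: kv[1])
        # Assumption: the threshold applies to this per-barcode fraction.
        secondary_fraction = (total - best_cov) / total
        if secondary_fraction <= max_secondary_fraction:
            primary[barcode] = (best_seq, best_cov)
    return primary


# Made-up example records: (12-nt barcode, consensus sequence, coverage).
records = [
    ("AAAACCCCGGGG", "ACGTACGT", 97),
    ("AAAACCCCGGGG", "ACGTTCGT", 3),   # 3% secondary: barcode kept
    ("TTTTGGGGCCCC", "ACGTACGT", 60),
    ("TTTTGGGGCCCC", "ACGAACGT", 40),  # 40% secondary: barcode dropped
]
result = select_primary_consensus(records)
```

In this toy input, only the first barcode passes, because its secondary sequence accounts for 3 of 100 reads, while the second barcode is ambiguous and is discarded.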
 
Submission date Jun 20, 2023
Last update date Jun 21, 2023
Contact name Rhiju Das
E-mail(s) rhiju@stanford.edu
Organization name Stanford University
Street address 279 Campus Dr, B419 Beckman Center
City PALO ALTO
State/province CA
ZIP/Postal code 94305
Country USA
 
Platform ID GPL17143
Series (1)
GSE209857 RNA structure landscape of S. cerevisiae introns
Relations
BioSample SAMN35810627
SRA SRX20731455

Supplementary file Size Download File type/resource
GSM7501617_barcode_variants_coverage.tar.gz 1.7 Mb (ftp)(http) TAR
Raw data are available in SRA
Processed data provided as supplementary file
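
As a rough aid for navigating the supplementary archive, the sketch below labels a file name with its sub-library gene and content type, using the sample indices (S1-S8, with no S6) and the file-name endings stated in the data processing description. The helper name and the example file names are hypothetical; in particular, it is an assumption that each file name carries its sample index as a separate token such as "_S1_", and the ending conseus_coverage.fa is spelled exactly as it appears in this record.

```python
import re

# Sample index to sub-library gene, as listed in this record.
SAMPLE_TO_GENE = {
    "S1": "QCR9", "S2": "RPL28", "S3": "RPS9B", "S4": "RPL7A",
    "S5": "RPS9A", "S7": "RPS14B", "S8": "RPL36B",
}

# File-name endings and their content, as described in this record.
SUFFIX_TO_CONTENT = {
    "conseus_coverage.fa": "barcode/consensus pairs with coverage",
    "complete.fa": "designed intron variant sequences",
    "designed_var_barcodes.txt": "barcodes and coverage per designed variant",
}

# Assumption: the sample index is a standalone token (e.g. "_S1_").
SAMPLE_TOKEN = re.compile(r"(?<![A-Za-z0-9])S(\d+)(?![A-Za-z0-9])")


def describe_member(name):
    """Return (gene, content) for a file name; None where unknown."""
    content = next(
        (desc for suffix, desc in SUFFIX_TO_CONTENT.items()
         if name.endswith(suffix)),
        None,
    )
    match = SAMPLE_TOKEN.search(name)
    gene = SAMPLE_TO_GENE.get("S" + match.group(1)) if match else None
    return gene, content


# Hypothetical file names, for illustration only.
example = describe_member("example_S7_complete.fa")
unknown = describe_member("README.txt")
```

The real member names can be listed with Python's tarfile module (tarfile.open(path).getnames()) and passed through describe_member; any name that returns None for either field should be inspected by hand rather than guessed.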
