|
Status |
Public on Jun 21, 2023 |
Title |
VarsSeqDNA |
Sample type |
SRA |
|
|
Source name |
Library integrated in PLH001
|
Organism |
Saccharomyces cerevisiae |
Characteristics |
strain: Library integrated in PLH001
|
Growth protocol |
Intron variant libraries were integrated into the genome of the S. cerevisiae strain PLH001 using homologous recombination, with CRISPR/Cas9-induced double-strand breaks to increase efficiency. Our library included seven sub-libraries with variants of the introns in the following genes: QCR9, RPL28, RPS9B, RPL7A, RPS9A, RPS14B, and RPL36B. Integrated constructs included intron variants in their complete gene context, with randomized 12N barcodes upstream of the introns in the 5' UTRs. For both DNA and RNA sequencing, overnight cultures were grown from sub-library glycerol stocks at 30 °C in YPD liquid medium, diluted to OD600 0.1, and grown to OD600 0.5-0.6.
|
Extracted molecule |
genomic DNA |
Extraction protocol |
DNA was extracted using the YeaStar Genomic DNA Kit (Zymo Research). We carried out a 7-cycle indexing PCR to amplify target intron library regions (complete variant sequences and barcode sequences) and to add i5/i7 indices for demultiplexing sub-libraries in Illumina sequencing. We then amplified our samples for 11-14 PCR cycles with P5/P7 primers to obtain sufficient material for Illumina sequencing, and we size-selected the final library using RNAClean XP beads (Beckman Coulter).
|
|
|
Library strategy |
OTHER |
Library source |
genomic |
Library selection |
other |
Instrument model |
Illumina MiSeq |
|
|
Data processing |
UMI-tools was used to extract randomized barcode sequences by identifying sequences directly downstream of primer binding sites, and paired-end reads were error-corrected using bbmerge.
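The barcode extraction step can be illustrated with a minimal sketch. This is not the UMI-tools invocation used for this sample; it is a plain-Python stand-in that assumes an exact primer match and a fixed 12-nt barcode (the 12N length comes from the growth protocol). The primer and read sequences, the function name `extract_barcode`, and the rule of rejecting barcodes that contain `N` are all hypothetical choices made for illustration.

```python
from typing import Optional

BARCODE_LENGTH = 12  # the record describes randomized 12N barcodes


def extract_barcode(read: str, primer: str,
                    barcode_length: int = BARCODE_LENGTH) -> Optional[str]:
    """Return the bases directly downstream of the first exact primer match.

    Returns None when the primer is absent, when the read ends before a
    full-length barcode, or when the barcode contains an ambiguous base
    (the last rule is an assumption of this sketch).
    """
    start = read.find(primer)
    if start == -1:
        return None
    begin = start + len(primer)
    barcode = read[begin:begin + barcode_length]
    if len(barcode) < barcode_length or "N" in barcode:
        return None
    return barcode


# Hypothetical primer and read, for illustration only.
example_primer = "ACGTACGT"
example_read = "TTACGTACGTAAACCCGGGTTTGATTACA"
print(extract_barcode(example_read, example_primer))
```

A real pipeline would typically tolerate mismatches in the primer binding site, which this exact-match sketch does not.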
Paired-end reads were aligned to intron reference sequences using bwa with default settings. Alignments were sorted using fgbio's SortBam and mate-pairs were annotated with fgbio's SetMateInformation.
Reads with the same barcode were grouped using fgbio's GroupReadsByUmi. Consensus reads were then called using fgbio's CallMolecularConsensusReads with default settings apart from the flags -m 20 -M 5.
Consensus reads were quality filtered using fgbio's FilterConsensusReads with the flags -M 5 -q 20 -N 20 -E 0.01. With samtools, the resulting alignment file was sorted and paired-end consensus reads were retrieved. Paired-end reads were then merged with bbmerge using default settings. For each barcode, we kept only the consensus read with the highest coverage (the primary consensus sequence), and we required that at most 5% of reads map to secondary consensus sequences. From the resulting consensus sequences, we obtained barcode mappings for designed variants for each of our 7 sub-libraries.
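The primary-consensus filter described above can be sketched as follows. This is one possible reading of the 5% rule, not the submitters' code: it assumes that, for each barcode, we already have the coverage (supporting read count) of every consensus sequence, and that the "secondary" fraction is the share of that barcode's reads not supporting the highest-coverage sequence. The function name `primary_consensus` and the input layout are hypothetical.

```python
from typing import Dict, Optional, Tuple

MAX_SECONDARY_FRACTION = 0.05  # at most 5% of reads on secondary consensus sequences


def primary_consensus(
    coverage_by_sequence: Dict[str, int],
    max_secondary_fraction: float = MAX_SECONDARY_FRACTION,
) -> Optional[Tuple[str, int]]:
    """Pick the highest-coverage consensus sequence for one barcode.

    Returns (sequence, coverage), or None when the barcode has no reads or
    when too large a share of its reads supports other consensus sequences.
    """
    total = sum(coverage_by_sequence.values())
    if total == 0:
        return None
    sequence, coverage = max(coverage_by_sequence.items(), key=lambda item: item[1])
    secondary_fraction = (total - coverage) / total
    if secondary_fraction > max_secondary_fraction:
        return None
    return sequence, coverage


# Hypothetical barcodes and sequences, for illustration only.
clean_barcode = {"GTATGTTAAT": 97, "GTATGTTACT": 3}   # 3% secondary reads
mixed_barcode = {"GTATGTTAAT": 60, "GTATGTTACT": 40}  # 40% secondary reads
print(primary_consensus(clean_barcode))
print(primary_consensus(mixed_barcode))
```

In this sketch a barcode that fails the threshold is dropped entirely rather than assigned to its best-supported sequence, which matches the requirement as stated but is still an assumption about how it was applied.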
Assembly: S. cerevisiae intron reference sequences were obtained from Talkish et al. 2019 (doi: 10.1371/journal.pgen.1008249). Coding ORF annotations were obtained from the Saccharomyces Genome Database for the S288C reference genome. Pre-mRNA reference sequences were obtained using the sacCer3 UCSC genome assembly.
Supplementary files format and content: For each sub-library, barcodes and corresponding variants are included along with coverage values. In files ending with conseus_coverage.fa, we provide pairs of barcodes and consensus sequences along with coverage values (number of sequencing reads supporting each consensus sequence). In files ending with complete.fa, we list all designed intron variant sequences for each sub-library, and in files ending with designed_var_barcodes.txt, we provide all barcodes and coverage values corresponding to each designed variant sequence. Samples are indexed as follows: S1 (QCR9), S2 (RPL28), S3 (RPS9B), S4 (RPL7A), S5 (RPS9A), S7 (RPS14B), and S8 (RPL36B).
Library strategy: Targeted gDNA sequencing
|
|
|
Submission date |
Jun 20, 2023 |
Last update date |
Jun 21, 2023 |
Contact name |
Rhiju Das |
E-mail(s) |
rhiju@stanford.edu
|
Organization name |
Stanford University
|
Street address |
279 Campus Dr, B419 Beckman Center
|
City |
PALO ALTO |
State/province |
CA |
ZIP/Postal code |
94305 |
Country |
USA |
|
|
Platform ID |
GPL17143 |
Series (1) |
GSE209857 |
RNA structure landscape of S. cerevisiae introns |
|
Relations |
BioSample |
SAMN35810627 |
SRA |
SRX20731455 |
Supplementary file |
Size |
Download |
File type/resource |
GSM7501617_barcode_variants_coverage.tar.gz |
1.7 Mb |
(ftp)(http) |
TAR |
SRA Run Selector |
Raw data are available in SRA |
Processed data provided as supplementary file |
|
|
|
|
 |