Sci Data. 2017 Dec 19;4:170194. doi: 10.1038/sdata.2017.194.

Long-read sequencing of the human cytomegalovirus transcriptome with the Pacific Biosciences RSII platform.

Author information

1 Department of Medical Biology, Faculty of Medicine, University of Szeged, Szeged 6720, Hungary.
2 Department of Genetics, School of Medicine, Stanford University, Stanford, California 94305, USA.

Abstract

Long-read RNA sequencing allows for the precise characterization of full-length transcripts, which makes it an indispensable tool in transcriptomics. The human cytomegalovirus (HCMV) genome was first sequenced in 1989, and although short-read sequencing studies have uncovered much of the complexity of its transcriptome, only a few of its transcripts have been fully annotated. We present a long-read RNA sequencing dataset of HCMV-infected human lung fibroblast cells, sequenced on the Pacific Biosciences RSII platform. Seven SMRT cells were sequenced using oligo(dT) primers to reverse transcribe poly(A)-selected RNA molecules, and one library was prepared using random primers for the reverse transcription of the rRNA-depleted sample. Our dataset contains 122,636 human and 33,086 viral (HCMV strain Towne) reads. The described data include raw and processed sequencing files and, combined with other datasets, can be used to validate transcriptome analysis tools, to compare library preparation methods, to test base calling algorithms, or to identify genetic variants.

PMID: 29257134
PMCID: PMC5735922
DOI: 10.1038/sdata.2017.194
[Indexed for MEDLINE]
Free PMC Article
