Proc Natl Acad Sci U S A. 2014 Jul 8;111(27):9869-74. doi: 10.1073/pnas.1400447111. Epub 2014 Jun 24.

Defining a personal, allele-specific, and single-molecule long-read transcriptome.

Author information

1 Department of Genetics, Stanford University, Stanford, CA 94305-5120.
2 Department of Genetics, Stanford University, Stanford, CA 94305-5120; and Department of Molecular, Cellular, and Developmental Biology, Yale University, New Haven, CT 06511.
3 Department of Genetics, Stanford University, Stanford, CA 94305-5120. Email: mpsnyder@stanford.edu.

Abstract

Personal transcriptomes in which all of an individual's genetic variants (e.g., single nucleotide variants) and transcript isoforms (transcription start sites, splice sites, and polyA sites) are defined and quantified for full-length transcripts are expected to be important for understanding individual biology and disease, but have not been described previously. To obtain such transcriptomes, we sequenced the lymphoblastoid transcriptomes of three family members (GM12878 and the parents GM12891 and GM12892) by using a Pacific Biosciences long-read approach complemented with Illumina 101-bp sequencing and made the following observations. First, we found that reads representing all splice sites of a transcript are evident for most sufficiently expressed genes ≤3 kb and often for genes longer than that. Second, we added and quantified previously unidentified splicing isoforms to an existing annotation, thus creating the first personalized annotation to our knowledge. Third, we determined SNVs in a de novo manner and connected them to RNA haplotypes, including HLA haplotypes, thereby assigning single full-length RNA molecules to their transcribed allele, and demonstrated Mendelian inheritance of RNA molecules. Fourth, we show how RNA molecules can be linked to personal variants on a one-by-one basis, which allows us to assess differential allelic expression (DAE) and differential allelic isoforms (DAI) from the phased full-length isoform reads. The DAI method is largely independent of the distance between exon and SNV, in contrast to fragmentation-based methods. Overall, in addition to improving eukaryotic transcriptome annotation, these results describe, to our knowledge, the first large-scale and full-length personal transcriptome.
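The allele assignment and DAE ideas in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the data layout (a read as a mapping from genomic position to observed base, phased SNVs as maternal/paternal base pairs), the `min_margin` threshold, and the use of an exact binomial test against a 50:50 expectation are all assumptions made for illustration.

```python
from math import comb

def assign_read_to_allele(read_bases, phased_snvs, min_margin=1):
    """Assign one long read to a parental allele (hypothetical helper).

    read_bases:  dict mapping position -> base observed in the read
    phased_snvs: dict mapping position -> (maternal_base, paternal_base)
    min_margin:  assumed minimum excess of supporting SNVs needed to call an allele
    """
    maternal = paternal = 0
    for pos, (m_base, p_base) in phased_snvs.items():
        base = read_bases.get(pos)  # None if the read does not cover this SNV
        if base == m_base:
            maternal += 1
        elif base == p_base:
            paternal += 1
    if maternal - paternal >= min_margin:
        return "maternal"
    if paternal - maternal >= min_margin:
        return "paternal"
    return "unassigned"

def binomial_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial p-value for k successes in n trials.

    Sums the probabilities of all outcomes that are no more likely than
    the observed one (an assumed, simple choice of test for allelic imbalance).
    """
    if n == 0:
        return 1.0
    probs = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    observed = probs[k]
    return min(1.0, sum(q for q in probs if q <= observed * (1 + 1e-9)))

# Toy example: two phased SNVs in one gene and four reads covering them.
snvs = {100: ("A", "G"), 250: ("C", "T")}
reads = [{100: "A", 250: "C"}, {100: "G"}, {100: "A", 250: "T"}, {}]
calls = [assign_read_to_allele(r, snvs) for r in reads]
n_mat = calls.count("maternal")
n_pat = calls.count("paternal")
p_value = binomial_two_sided_p(n_mat, n_mat + n_pat)
```

Because each full-length read carries its own SNVs and its own splice structure, the same per-read allele calls could in principle be tabulated per isoform rather than per gene, which is the intuition behind comparing isoforms between alleles regardless of the distance between exon and SNV.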

KEYWORDS:

allele-specific expression; alternative splicing; isoform sequencing; personalized medicine; platform comparison

PMID: 24961374
PMCID: PMC4103364
DOI: 10.1073/pnas.1400447111
[Indexed for MEDLINE]
Free PMC Article
