J Proteome Res. 2018 Oct 5;17(10):3431-3444. doi: 10.1021/acs.jproteome.8b00310. Epub 2018 Sep 6.

Isoform-Level Interpretation of High-Throughput Proteomics Data Enabled by Deep Integration with RNA-seq.

Author information

Department of Psychiatry, Yale School of Medicine, Connecticut Mental Health Center, 34 Park Street, New Haven, Connecticut 06519, United States.
Department of Molecular Biophysics & Biochemistry, Yale School of Medicine, P.O. Box 208114, New Haven, Connecticut 06520, United States.
Yale/NIDA Neuroproteomics Center, Yale School of Medicine, 300 George Street, New Haven, Connecticut 06510, United States.
W.M. Keck Biotechnology Resource Laboratory, Yale School of Medicine, 300 George Street, New Haven, Connecticut 06510, United States.
Department of Neuroscience and Kavli Institute for Neuroscience, Departments of Genetics and Psychiatry, Section of Comparative Medicine, and Yale Child Study Center, Program in Cellular Neuroscience, Neurodegeneration and Repair, Yale School of Medicine, New Haven, Connecticut 06510, United States.


Cellular control of gene expression is a complex process that is subject to multiple levels of regulation, but ultimately it is the protein produced that determines the biosynthetic state of the cell. One way that a cell can regulate the protein output from each gene is by expressing alternate isoforms with distinct amino acid sequences. These isoforms may exhibit differences in localization and binding interactions that can have profound functional implications. High-throughput liquid chromatography tandem mass spectrometry proteomics (LC-MS/MS) relies on enzymatic digestion and has lower coverage and sensitivity than transcriptomic profiling methods such as RNA-seq. Digestion results in predictable fragmentation of a protein, which can limit the generation of peptides capable of distinguishing between isoforms. Here we exploit transcript-level expression from RNA-seq to set prior likelihoods and enable protein isoform abundances to be directly estimated from LC-MS/MS, an approach derived from the principle that most genes appear to be expressed as a single dominant isoform in a given cell type or tissue. Through this deep integration of RNA-seq and LC-MS/MS data from the same sample, we show that a principal isoform can be identified in >80% of gene products in homogeneous HEK293 cell culture and >70% of proteins detected in complex human brain tissue. We demonstrate that the incorporation of translatome data from ribosome profiling further refines this process. Defining isoforms in experiments with matched RNA-seq/translatome and proteomic data increases the functional relevance of such data sets and will further broaden our understanding of multilevel control of gene expression.
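To make the idea of RNA-seq-derived priors concrete, the following is a minimal sketch of an expectation-maximization loop for a single gene. It is not the authors' implementation: the function name, the additive pseudo-count form of the prior, the `prior_weight` parameter, and the toy input values are all assumptions chosen for illustration. Peptides shared between isoforms are split in proportion to the current isoform fractions, and the transcript-level fractions act as pseudo-counts that break ties when no isoform-specific peptide is observed.

```python
def estimate_isoform_fractions(peptide_counts, compat, rna_prior,
                               prior_weight=1.0, n_iter=200, tol=1e-9):
    """Toy EM estimate of isoform fractions for one gene (illustrative only).

    peptide_counts: dict peptide -> observed count (e.g. spectral count)
    compat:         dict peptide -> set of isoforms containing that peptide
    rna_prior:      dict isoform -> transcript abundance (e.g. TPM)
    prior_weight:   hypothetical strength of the RNA-seq pseudo-counts
    """
    isoforms = sorted(rna_prior)
    total = sum(rna_prior.values())
    prior = {i: rna_prior[i] / total for i in isoforms}
    theta = dict(prior)  # start from the transcript-level fractions

    for _ in range(n_iter):
        # Pseudo-counts contributed by the RNA-seq prior.
        alloc = {i: prior_weight * prior[i] for i in isoforms}
        # E-step: split each peptide's count across compatible isoforms.
        for pep, count in peptide_counts.items():
            denom = sum(theta[i] for i in compat[pep])
            if denom == 0:
                continue
            for i in compat[pep]:
                alloc[i] += count * theta[i] / denom
        # M-step: renormalize allocated counts into fractions.
        norm = sum(alloc.values())
        new_theta = {i: alloc[i] / norm for i in isoforms}
        delta = max(abs(new_theta[i] - theta[i]) for i in isoforms)
        theta = new_theta
        if delta < tol:
            break
    return theta


if __name__ == "__main__":
    # Hypothetical gene with two isoforms; values are made up.
    fractions = estimate_isoform_fractions(
        peptide_counts={"shared_pep": 10, "unique_to_A": 2},
        compat={"shared_pep": {"A", "B"}, "unique_to_A": {"A"}},
        rna_prior={"A": 90.0, "B": 10.0},
    )
    principal = max(fractions, key=fractions.get)
    print(principal, fractions)
```

In this toy setup, a gene with only shared peptides keeps its RNA-seq fractions unchanged, which illustrates why transcript-level priors matter when digestion yields no isoform-distinguishing peptide.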


Keywords: HEK293; RNA-seq; brain; expectation maximization; integrative analysis; isoforms; mass spectrometry; peptides; proteogenomics; ribosome profiling

[Available on 2019-10-05]
