Nucleic Acids Res. 2015 Mar 11;43(5):e29. doi: 10.1093/nar/gku1283. Epub 2014 Dec 15.

PROTEOFORMER: deep proteome coverage through ribosome profiling and MS integration.

Author information

1. Lab of Bioinformatics and Computational Genomics, Department of Mathematical Modeling, Statistics and Bioinformatics, Faculty of Bioscience Engineering, Ghent University, Ghent, Belgium.
2. Lab of Bioinformatics and Computational Genomics, Department of Mathematical Modeling, Statistics and Bioinformatics, Faculty of Bioscience Engineering, Ghent University, Ghent, Belgium; Department of Medical Protein Research, Flemish Institute of Biotechnology, Ghent, Belgium; Department of Biochemistry, Faculty of Medicine and Health Sciences, Ghent University, Ghent, Belgium.
3. Department of Medical Protein Research, Flemish Institute of Biotechnology, Ghent, Belgium; Department of Biochemistry, Faculty of Medicine and Health Sciences, Ghent University, Ghent, Belgium.
4. Lab of Bioinformatics and Computational Genomics, Department of Mathematical Modeling, Statistics and Bioinformatics, Faculty of Bioscience Engineering, Ghent University, Ghent, Belgium; gerben.menschaert@ugent.be.

Abstract

An increasing number of studies integrate mRNA sequencing data into MS-based proteomics to complement the translation product search space. However, several factors, including extensive regulation of mRNA translation and the need for three- or six-frame translation, impede the use of mRNA-seq data for the construction of a protein sequence search database. With that in mind, we developed the PROTEOFORMER tool, which automatically processes data from the recently developed ribosome profiling method (sequencing of ribosome-protected mRNA fragments), resulting in genome-wide visualization of ribosome occupancy. Our tool also includes a translation initiation site calling algorithm that allows the delineation of the open reading frames (ORFs) of all translation products. A complete protein synthesis-based sequence database can thus be compiled for mass spectrometry-based identification. This approach increases the overall protein identification rates by 3% and 11% (improved and new identifications) for human and mouse, respectively, and enables proteome-wide detection of 5'-extended proteoforms, upstream ORF translation and near-cognate translation start sites. The PROTEOFORMER tool is available as a stand-alone pipeline and has been implemented in the Galaxy framework for ease of use.
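
To illustrate why a called translation initiation site (TIS) removes the need for three- or six-frame translation, the following is a minimal Python sketch, not PROTEOFORMER's actual code. It translates a transcript from a given TIS to the first in-frame stop codon. The function name `translate_from_tis`, the toy transcript, and the choice to record the first residue as methionine even at a near-cognate start codon such as CTG are assumptions made for illustration only.

```python
# Standard genetic code, built from the conventional T, C, A, G codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate_from_tis(transcript: str, tis_index: int) -> str:
    """Translate from a called TIS (0-based index) to the first in-frame stop.

    Hypothetical helper: the TIS fixes both the start and the reading frame,
    so only one frame has to be translated.
    """
    seq = transcript.upper()
    protein = []
    for pos in range(tis_index, len(seq) - 2, 3):
        residue = CODON_TABLE[seq[pos:pos + 3]]
        if residue == "*":  # in-frame stop codon ends the ORF
            break
        protein.append(residue)
    # Assumption: initiation at a near-cognate codon still yields an
    # N-terminal methionine, so the first residue is recorded as M.
    if protein:
        protein[0] = "M"
    return "".join(protein)


# Toy transcript with an upstream in-frame CTG (index 2) and an ATG (index 8).
transcript = "GGCTGGCCATGAAATAGCC"
canonical = translate_from_tis(transcript, 8)  # starts at ATG
extended = translate_from_tis(transcript, 2)   # starts at upstream CTG
```

In this toy case the upstream CTG start gives a 5'-extended product that contains the canonical one as its C-terminal part. Both sequences could be written to the search database as separate entries.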

PMID: 25510491
PMCID: PMC4357689
DOI: 10.1093/nar/gku1283
[Indexed for MEDLINE]
Free PMC Article
