Format

Send to

Choose Destination
See comment in PubMed Commons below
Nat Methods. 2014 Nov;11(11):1114-25. doi: 10.1038/nmeth.3144.

Proteogenomics: concepts, applications and computational strategies.

Author information

1
1] Department of Pathology, University of Michigan, Ann Arbor, Michigan, USA. [2] Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, USA.

Abstract

Proteogenomics is an area of research at the interface of proteomics and genomics. In this approach, customized protein sequence databases generated using genomic and transcriptomic information are used to help identify novel peptides (not present in reference protein sequence databases) from mass spectrometry-based proteomic data; in turn, the proteomic data can be used to provide protein-level evidence of gene expression and to help refine gene models. In recent years, owing to the emergence of new sequencing technologies such as RNA-seq and dramatic improvements in the depth and throughput of mass spectrometry-based proteomics, the pace of proteogenomic research has greatly accelerated. Here I review the current state of proteogenomic methods and applications, including computational strategies for building and using customized protein sequence databases. I also draw attention to the challenge of false positive identifications in proteogenomics and provide guidelines for analyzing the data and reporting the results of proteogenomic studies.

PMID:
25357241
PMCID:
PMC4392723
DOI:
10.1038/nmeth.3144
[Indexed for MEDLINE]
Free PMC Article
PubMed Commons home

PubMed Commons

0 comments
How to join PubMed Commons

    Supplemental Content

    Full text links

    Icon for Nature Publishing Group Icon for PubMed Central
    Loading ...
    Support Center