Nat Methods. 2017 Sep;14(9):921-927. doi: 10.1038/nmeth.4398. Epub 2017 Aug 21.

Statistical control of peptide and protein error rates in large-scale targeted data-independent acquisition analyses.

Author information

1. Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland.
2. PhD Program in Systems Biology, University of Zurich and ETH Zurich, Zurich, Switzerland.
3. ID Scientific IT Services, ETH Zurich, Zurich, Switzerland.
4. PhD program in Molecular and Translational Biomedicine, Competence Center Personalized Medicine (CC-PM), ETH Zurich and University of Zurich, Zurich, Switzerland.
5. SCIEX, Redwood City, California, USA.
6. Department of Genome Sciences, University of Washington, Seattle, Washington, USA.
7. Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, USA.
8. Department of Pathology, University of Michigan, Ann Arbor, Michigan, USA.
9. Biognosys, Schlieren, Switzerland.
10. SCIEX, Concord, Ontario, Canada.
11. Faculty of Science, University of Zurich, Zurich, Switzerland.

Abstract

Liquid chromatography coupled to tandem mass spectrometry (LC-MS/MS) is the main method for high-throughput identification and quantification of peptides and inferred proteins. Within this field, data-independent acquisition (DIA) combined with peptide-centric scoring, as exemplified by the technique SWATH-MS, has emerged as a scalable method to achieve deep and consistent proteome coverage across large-scale data sets. We demonstrate that statistical concepts developed for discovery proteomics based on spectrum-centric scoring can be adapted to large-scale DIA experiments that have been analyzed with peptide-centric scoring strategies, and we provide guidance on their application. We show that optimal tradeoffs between sensitivity and specificity require careful considerations of the relationship between proteins in the samples and proteins represented in the spectral library. We propose the application of a global analyte constraint to prevent the accumulation of false positives across large-scale data sets. Furthermore, to increase the quality and reproducibility of published proteomic results, well-established confidence criteria should be reported for the detected peptide queries, peptides and inferred proteins.
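The idea of a global analyte constraint can be illustrated with a small, generic sketch. It is not the authors' software: the record layout (run, analyte, is_decoy, score), the function names, the 1% threshold and the use of a simple target-decoy estimator are all assumptions made for illustration. The sketch keeps only the best score per analyte across all runs, estimates q-values on that experiment-wide list, and returns the analytes that pass, which is one possible reading of how false positives are kept from accumulating run by run.

```python
def best_score_per_analyte(records):
    """Collapse all runs to one entry per analyte: its best score anywhere.

    `records` is an iterable of (run, analyte, is_decoy, score) tuples;
    this layout is a hypothetical one chosen for the example.
    """
    best = {}
    for _run, analyte, is_decoy, score in records:
        key = (analyte, is_decoy)
        if key not in best or score > best[key]:
            best[key] = score
    return [(analyte, is_decoy, score) for (analyte, is_decoy), score in best.items()]


def q_values(scored):
    """Estimate q-values with a basic target-decoy ratio (assumed estimator).

    At each score cutoff the FDR is approximated as
    (decoys at or above cutoff) / (targets at or above cutoff),
    then made monotone so a better score never gets a worse q-value.
    """
    ordered = sorted(scored, key=lambda r: r[2], reverse=True)
    targets = decoys = 0
    fdrs = []
    for _analyte, is_decoy, _score in ordered:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        fdrs.append(min(1.0, decoys / max(targets, 1)))

    q = {}
    running_min = 1.0
    # Walk from the worst score to the best, carrying the minimum FDR seen.
    for (analyte, is_decoy, _score), fdr in zip(reversed(ordered), reversed(fdrs)):
        running_min = min(running_min, fdr)
        if not is_decoy:
            q[analyte] = running_min
    return q


def globally_accepted(records, threshold=0.01):
    """Return target analytes whose experiment-wide q-value passes the threshold."""
    q = q_values(best_score_per_analyte(records))
    return {analyte for analyte, value in q.items() if value <= threshold}


# Toy data: three target peptides and two decoys spread over two runs.
records = [
    ("run1", "PEPA", False, 9.0),
    ("run2", "PEPA", False, 8.0),
    ("run1", "PEPB", False, 7.0),
    ("run1", "DECOY1", True, 3.0),
    ("run2", "PEPC", False, 2.0),
    ("run2", "DECOY2", True, 1.0),
]
accepted = globally_accepted(records)
```

In this toy data set PEPC scores below a decoy, so it fails the experiment-wide filter and would not be reported in any individual run, however well it scored there.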

PMID: 28825704
PMCID: PMC5581544
DOI: 10.1038/nmeth.4398
[Indexed for MEDLINE]
Free PMC Article
