Proteomics. 2015 Aug;15(15):2580-91. doi: 10.1002/pmic.201400620. Epub 2015 May 28.

EBprot: Statistical analysis of labeling-based quantitative proteomics data.

Author information

Saw Swee Hock School of Public Health, National University of Singapore and National University Health System, Singapore.
Institute of Molecular and Cell Biology, A*STAR, Singapore.
Department of Pathology, Yale University, New Haven, CT, USA.
Yong Loo Lin School of Medicine, National University of Singapore, Singapore.


Labeling-based proteomics is a powerful method for detection of differentially expressed proteins (DEPs). Current data analysis platforms typically rely on protein-level ratios, which are obtained by summarizing peptide-level ratios for each protein. In shotgun proteomics, however, some proteins are quantified with more peptides than others, and this reproducibility information is not incorporated into the differential expression (DE) analysis. Here, we propose a novel probabilistic framework, EBprot, that directly models the peptide-protein hierarchy and rewards proteins with reproducible evidence of DE over multiple peptides. To evaluate its performance with known DE states, we conducted a simulation study, which showed that the peptide-level analysis of EBprot provides better receiver operating characteristic performance and more accurate estimation of false discovery rates than methods based on protein-level ratios. We also demonstrate superior classification performance of the peptide-level EBprot analysis in a spike-in dataset. To illustrate the wide applicability of EBprot across experimental designs, we applied it to a dataset for lung cancer subtype analysis with biological replicates and to another dataset for time course phosphoproteome analysis of EGF-stimulated HeLa cells with multiplexed labeling. Through these examples, we show that the peptide-level analysis of EBprot is a robust alternative to existing statistical methods for the DE analysis of labeling-based quantitative datasets. The software suite is freely available on the Sourceforge website. All MS data have been deposited in the ProteomeXchange with identifier PXD001426.
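
The contrast between protein-level ratios and peptide-level evidence can be illustrated with a toy calculation. The sketch below is not the EBprot model. It assumes a simple three-state mixture (null, up, down) with fixed, made-up Gaussian parameters and peptides that are independent given the protein state. The function names, `prior_de`, `de_mean`, and `sd` are all hypothetical choices made for illustration only.

```python
import math
from statistics import median


def log_normal_pdf(x, mean, sd):
    """Log density of a normal distribution N(mean, sd^2) at x."""
    z = (x - mean) / sd
    return -0.5 * z * z - math.log(sd * math.sqrt(2.0 * math.pi))


def protein_de_probability(log_ratios, prior_de=0.1, de_mean=1.0, sd=0.5):
    """Posterior probability that a protein is DE given its peptide log2 ratios.

    Toy assumptions (not taken from the paper): the protein is in one of three
    states (null, up, down); peptide log2 ratios are independent given the
    state and centered at 0, +de_mean, or -de_mean with a common sd.
    """
    log_null = math.log(1.0 - prior_de) + sum(
        log_normal_pdf(r, 0.0, sd) for r in log_ratios)
    log_up = math.log(prior_de / 2.0) + sum(
        log_normal_pdf(r, de_mean, sd) for r in log_ratios)
    log_down = math.log(prior_de / 2.0) + sum(
        log_normal_pdf(r, -de_mean, sd) for r in log_ratios)

    # Normalize in log space to avoid underflow with many peptides.
    top = max(log_null, log_up, log_down)
    w_null = math.exp(log_null - top)
    w_up = math.exp(log_up - top)
    w_down = math.exp(log_down - top)
    return (w_up + w_down) / (w_null + w_up + w_down)


# Three hypothetical proteins that share the same median log2 ratio.
one_peptide = [1.0]
five_consistent = [1.0, 1.0, 1.0, 1.0, 1.0]
five_mixed = [1.0, -1.0, 1.0, -1.0, 1.0]

for name, ratios in [("one peptide", one_peptide),
                     ("five consistent", five_consistent),
                     ("five mixed", five_mixed)]:
    print(f"{name}: median={median(ratios):.2f}, "
          f"P(DE)={protein_de_probability(ratios):.4f}")
```

All three proteins have the same median log2 ratio, so a summary based only on the protein-level ratio cannot distinguish them. Under the toy model, the protein with five concordant peptides receives the highest DE probability and the protein with conflicting peptides the lowest, which is the kind of reproducibility information the abstract refers to.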


Bioinformatics; Differential expression; Hierarchical mixture model; Quantitative analysis; Stable isotope labeling

[Indexed for MEDLINE]
