J Biomol Tech. 2018 Jul;29(2):39-45. doi: 10.7171/jbt.18-2902-003. Epub 2018 Jun 21.

ABRF Proteome Informatics Research Group (iPRG) 2016 Study: Inferring Proteoforms from Bottom-up Proteomics Data.

Author information

1 Pacific Northwest National Laboratory, Richland, Washington 99352, USA.
2 National University of Singapore, 117547 Singapore, Singapore.
3 Agilent Technologies, 121 Hartwell Ave., Lexington, MA 02421.
4 Janssen Research and Development, Spring House, Pennsylvania 19087, USA.
5 Institute for Systems Biology, Seattle, Washington 98109, USA.
6 Science for Life Laboratory, KTH - Royal Institute of Technology, 171 65 Solna, Sweden.
7 Department of Chemical and Biological Engineering, The Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong, China.
8 European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom.
9 Department of Biochemistry and Structural Biology, The University of Texas Health Science Center, San Antonio, Texas 78229, USA.
10 Center for Proteomics and Metabolomics, Leiden University Medical Center, 2300 RC Leiden, The Netherlands.

Abstract

This report presents the results from the 2016 Association of Biomolecular Resource Facilities Proteome Informatics Research Group (iPRG) study on proteoform inference and false discovery rate (FDR) estimation from bottom-up proteomics data. For this study, 3 replicate Q Exactive Orbitrap liquid chromatography-tandem mass spectrometry datasets were generated from each of 4 Escherichia coli samples spiked with different equimolar mixtures of small recombinant proteins selected to mimic pairs of homologous proteins. Participants were given raw data and a sequence file and asked to identify the proteins and provide estimates of the FDR at the proteoform level. As part of this study, we tested a new submission system with a format validator running on a virtual private server (VPS) and allowed methods to be provided as executable R Markdown or IPython Notebooks. The task was perceived as difficult, and only eight unique submissions were received, although those who participated did well, with no one method performing best on all samples. However, none of the submissions included a complete Markdown or Notebook, even though examples were provided. Future iPRG studies need to be more successful in promoting and encouraging participation. The VPS and submission validator easily scale to much larger numbers of participants in these types of studies. The unique "ground-truth" dataset for proteoform identification generated for this study is now available to the research community, as are the server-side scripts for validating and managing submissions.

KEYWORDS:

best practice; community study; false discovery rate; inference
