BuildSummary: using a group-based approach to improve the sensitivity of peptide/protein identification in shotgun proteomics

J Proteome Res. 2012 Mar 2;11(3):1494-502. doi: 10.1021/pr200194p. Epub 2012 Feb 8.

Abstract

The target-decoy database search strategy is widely accepted as a standard method for estimating the false discovery rate (FDR) of peptide identification, and peptide-spectrum matches (PSMs) from the target database are filtered on the basis of that estimate. To improve the sensitivity of protein identification at a fixed accuracy (frequently defined by a protein FDR threshold), a postprocessing procedure is often used that integrates results from different peptide search engines that have searched the same data set. In this work, we show that PSMs should be filtered separately within groups defined by the precursor charge, the number of missed internal cleavage sites, the modification state, and the number of protease termini, and that proteins should be filtered separately within groups defined by their unique peptide count, in each case according to the given FDR. We also develop an iterative procedure to filter the PSMs and proteins simultaneously according to the given FDR. Finally, we present a general framework to integrate the results from different peptide search engines using the same FDR threshold. Our method was tested with several shotgun proteomics data sets that were acquired by multiple LC/MS instruments from two different biological samples, and it performed satisfactorily on them. We implemented the method in a user-friendly software package called BuildSummary, which can be downloaded for free from http://www.proteomics.ac.cn/software/proteomicstools/index.htm as part of the software suite ProteomicsTools.
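The group-wise PSM filtering described in the abstract can be illustrated with a minimal Python sketch. It is not the BuildSummary implementation: the `PSM` class, its field names, and the functions `group_key`, `filter_by_fdr`, and `filter_by_group` are hypothetical, and the FDR estimate used here (decoy count divided by target count above the score cutoff) is one common target-decoy convention assumed for illustration. The abstract does not state which formula the authors use, and the protein-level grouping and the iterative PSM/protein procedure are not covered.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class PSM:
    """Hypothetical PSM record; field names are assumptions for this sketch."""
    score: float           # higher is better
    is_decoy: bool         # True if the match came from the decoy database
    charge: int            # precursor charge
    missed_cleavages: int  # number of missed internal cleavage sites
    modified: bool         # modification state
    protease_termini: int  # number of protease termini (0, 1, or 2)


def group_key(psm):
    """Grouping features named in the abstract."""
    return (psm.charge, psm.missed_cleavages, psm.modified, psm.protease_termini)


def filter_by_fdr(psms, max_fdr):
    """Keep target PSMs above the lowest score cutoff with estimated FDR <= max_fdr.

    Assumed estimate: FDR = decoys / targets among PSMs at or above the cutoff.
    Tied scores are not treated specially in this sketch.
    """
    ranked = sorted(psms, key=lambda p: p.score, reverse=True)
    targets = decoys = 0
    best_cut = 0  # number of top-ranked PSMs to keep
    for i, p in enumerate(ranked, start=1):
        if p.is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets > 0 and decoys / targets <= max_fdr:
            best_cut = i
    return [p for p in ranked[:best_cut] if not p.is_decoy]


def filter_by_group(psms, max_fdr):
    """Apply the same FDR threshold separately within each PSM group."""
    groups = defaultdict(list)
    for p in psms:
        groups[group_key(p)].append(p)
    accepted = []
    for members in groups.values():
        accepted.extend(filter_by_fdr(members, max_fdr))
    return accepted
```

Each group receives its own score cutoff, so a group whose decoys score highly is filtered more strictly than a group with few decoys, which is the behavior a single pooled cutoff cannot provide.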

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Data Interpretation, Statistical
  • Databases, Protein
  • Humans
  • Mice
  • Peptide Fragments / chemistry
  • Peptide Mapping / methods*
  • Peptide Mapping / standards
  • Proteolysis
  • Proteome / chemistry*
  • Proteomics / methods*
  • Search Engine
  • Software*
  • Tandem Mass Spectrometry

Substances

  • Peptide Fragments
  • Proteome