Send to

Choose Destination
J Proteome Res. 2009 May;8(5):2241-52. doi: 10.1021/pr800678b.

A ranking-based scoring function for peptide-spectrum matches.

Author information

Department of Computer Science and Engineering, University of California, San Diego, 9500 Gilman Drive, Mail Code 0404 La Jolla, California 92093-0404, USA.


The analysis of the large volume of tandem mass spectrometry (MS/MS) proteomics data that is generated these days relies on automated algorithms that identify peptides from their mass spectra. An essential component of these algorithms is the scoring function used to evaluate the quality of peptide-spectrum matches (PSMs). In this paper, we present new approach to scoring of PSMs. We argue that since this problem is at its core a ranking task (especially in the case of de novo sequencing), it can be solved effectively using machine learning ranking algorithms. We developed a new discriminative boosting-based approach to scoring. Our scoring models draw upon a large set of diverse feature functions that measure different qualities of PSMs. Our method improves the performance of our de novo sequencing algorithm beyond the current state-of-the-art, and also greatly enhances the performance of database search programs. Furthermore, by increasing the efficiency of tag filtration and improving the sensitivity of PSM scoring, we make it practical to perform large-scale MS/MS analysis, such as proteogenomic search of a six-frame translation of the human genome (in which we achieve a reduction of the running time by a factor of 15 and a 60% increase in the number of identified peptides, compared to the InsPecT database search tool). Our scoring function is incorporated into PepNovo+ which is available for download or can be run online at

[Indexed for MEDLINE]
Free PMC Article

Supplemental Content

Full text links

Icon for American Chemical Society Icon for PubMed Central
Loading ...
Support Center