Anal Chem. 2012 Sep 18;84(18):7919-25. doi: 10.1021/ac301674y. Epub 2012 Aug 31.

Implicit biology in peptide spectral libraries.

Author information

  • Department of Biological Chemistry, Hebrew University of Jerusalem, Israel.


Mass spectral libraries are collections of mass spectra curated specifically to facilitate the identification of small molecules, metabolites, and short peptides. One of the most comprehensive peptide spectral libraries is curated by NIST and contains upward of half a million annotated spectra, dominated by spectra from human and from model organisms including budding yeast and mouse. Although these libraries were motivated primarily by the technological goal of increasing sensitivity and specificity in spectral identification, we have found that the NIST spectral library constitutes a surprisingly rich source of biological knowledge. In this Article, we show that data-mining of these published libraries under strict empirical thresholds reveals many characteristics of protein biology. In particular, we demonstrate that the size and increasingly comprehensive nature of these libraries, generated from whole-proteome digests, enable inference from the presence but, crucially, also from the absence of spectra for individual peptides. We illustrate implicit biological trends in which a significant absence of spectra is accounted for by complex post-translational modifications and overlooked proteolytic sites. We conclude that many subtle biological signatures, such as genetic variants, regulated proteolysis, and post-translational modifications, are exposed through the systematic mining of spectral collections originally compiled as general-purpose, technology-oriented resources.

[PubMed - indexed for MEDLINE]