J Proteome Res. 2016 Nov 4;15(11):4101-4115. Epub 2016 Sep 15.

Detection of Missing Proteins Using the PRIDE Database as a Source of Mass Spectrometry Evidence.

Author information

Proteomics and Bioinformatics Unit, Center for Applied Medical Research, University of Navarra, 31008, Pamplona, Spain.
IdiSNA, Navarra Institute for Health Research, 31008, Pamplona, Spain.
Proteomics Unit, Spanish National Cancer Research Centre, 28029, Madrid, Spain.
European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, U.K.
Proteomics Unit (SCSIE), University of Valencia, 46010, Valencia, Spain.
Andrology Laboratory and Sperm Bank, Instituto Universitario IVI, 46015, Valencia, Spain.
Fundación IVI/INCLIVA, 46010, Valencia, Spain.
Biochemistry Department, University of Valencia, 46010, Valencia, Spain.
Division of Hepatology and Gene Therapy, Center for Applied Medical Research, University of Navarra, 31008, Pamplona, Spain.


The current catalogue of the human proteome is not yet complete, as experimental proteomics evidence is still elusive for a group of proteins known as the missing proteins. The Human Proteome Project (HPP) has been successfully using technology and bioinformatic resources to improve the characterization of such challenging proteins. In this manuscript, we propose a pipeline that starts by mining the PRIDE database to select data sets potentially enriched in missing proteins; these data sets are subsequently analyzed for protein identification with a method based on the statistical analysis of proteotypic peptides. Spermatozoa and the HEK293 cell line were found to be promising sources of missing proteins and clearly merit further attention in future studies. After the analysis of the selected samples, we found 342 peptide-spectrum matches (PSMs), suggesting the presence of 97 missing proteins in human spermatozoa or the HEK293 cell line, while only 36 missing proteins were potentially detected in the retina, frontal cortex, aorta thoracica, or placenta. The functional analysis of the missing proteins detected confirmed their tissue specificity, and the validation of a selected set of peptides using targeted proteomics (SRM/MRM assays) further supports the utility of the proposed pipeline. As illustrative examples, DNAH3 and TEPP in spermatozoa, and UNCX and ATAD3C in HEK293 cells, were among the more robust and remarkable identifications in this study. We provide evidence of the value of carefully analyzing the ever-increasing MS/MS data available from PRIDE and other repositories as a source for detecting missing proteins in specific biological matrices, as shown here for HEK293 cells.


Keywords: C-HPP; MS/MS proteomics; PRIDE database; missing proteins

[Indexed for MEDLINE]