J Proteome Res. 2015 Sep 4;14(9):3680-92. doi: 10.1021/acs.jproteome.5b00481. Epub 2015 Jul 15.

Special Enrichment Strategies Greatly Increase the Efficiency of Missing Proteins Identification from Regular Proteome Samples.

Author information

State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Engineering Research Center for Protein Drugs, National Center for Protein Sciences, Beijing Institute of Radiation Medicine, Beijing 102206, China.
Institute of Microbiology, Chinese Academy of Science, Beijing 100101, China.
Key Laboratory of Combinatorial Biosynthesis and Drug Discovery (Wuhan University), Ministry of Education, and Wuhan University School of Pharmaceutical Sciences, Wuhan 430071, China.
Anhui Medical University, Hefei 230032, Anhui, China.
Life Science College, Southwest Forestry University, Kunming 650224, China.
State Key Laboratory for Conservation and Utilization of Subtropical Agro-Bioresources, Guangxi University, Nanning 530005, China.
BGI-Shenzhen, Shenzhen 518083, China.
Inner Mongolia Medical University, Hohhot 010110, Inner Mongolia, China.
Institute of Biomedical Sciences, Department of Chemistry, and Zhongshan Hospital, Fudan University, 130 DongAn Road, Shanghai 200032, China.


As part of the Chromosome-Centric Human Proteome Project (C-HPP) mission, laboratories all over the world have been trying to map all missing proteins (MPs) since 2012. On the basis of the first and second Chinese Chromosome Proteome Database (CCPD 1.0 and 2.0) studies, we developed systematic enrichment strategies to identify MPs that fell into four classes: (1) low molecular weight (LMW) proteins, (2) membrane proteins, (3) proteins that contained various post-translational modifications (PTMs), and (4) nucleic acid-associated proteins. Of 8845 proteins identified in 7 data sets, 79 were classified as MPs. Among the data sets derived from different enrichment strategies, those for LMW and PTM yielded the most novel MPs. In addition, some MPs were identified in multiple data sets, which implies that tandem enrichment methods might improve the ability to identify MPs. Low expression at the transcription level was the major reason these MPs had been "missing"; however, MPs with higher expression levels had also evaded identification, most likely because of other characteristics such as LMW, high hydrophobicity, and PTMs. By combining a stringent manual check of the MS2 spectra with verification by peptide synthesis, we confirmed 30 MPs (neXtProt PE2 to PE4) and 6 potential MPs (neXtProt PE5) with authentic MS evidence. With the large-scale CCPD 2.0 data sets integrated, the number of identified proteins increased considerably beyond simulation saturation. Here, we show that special enrichment strategies can break through the data saturation bottleneck, which could increase the efficiency of MP identification in future C-HPP studies. All 7 data sets have been uploaded to ProteomeXchange with the identifier PXD002255.


KEYWORDS: chromosome-centric human proteome project; enrichment strategies; missing protein; proteome

[Indexed for MEDLINE]
