J Proteomics. 2016 Oct 21;149:7-14. doi: 10.1016/j.jprot.2016.08.005. Epub 2016 Aug 13.

Why are they missing?: Bioinformatics characterization of missing human proteins.

Author information

1. Biofluid Biomarker Center, Institute of Social innovation and Co-operation, Niigata University, Niigata 951-2181, Japan; Biotechnology Department, Faculty of Agriculture, Al-Azhar University, Cairo 11682, Egypt.
2. Biofluid Biomarker Center, Institute of Social innovation and Co-operation, Niigata University, Niigata 951-2181, Japan; Department of Physiology, Faculty of Veterinary Medicine, Suez Canal University, Ismailia 41522, Egypt.
3. Biofluid Biomarker Center, Institute of Social innovation and Co-operation, Niigata University, Niigata 951-2181, Japan.
4. Biotechnology Department, Faculty of Agriculture, Al-Azhar University, Cairo 11682, Egypt.
5. Biofluid Biomarker Center, Institute of Social innovation and Co-operation, Niigata University, Niigata 951-2181, Japan. Electronic address: tadashiy-bbc@ccr.niigata-u.ac.jp.

Abstract

NeXtProt is a web-based protein knowledge platform that supports research on human proteins. NeXtProt (release 2015-04-28) lists 20,060 proteins; among them, 3373 canonical proteins (16.8%) lack credible experimental evidence at the protein level (PE2-PE5) and are therefore considered "missing proteins". A comprehensive bioinformatic workflow has been proposed to analyze these "missing" proteins. The aims of the current study were to analyze the physicochemical properties of the missing proteins and the existence and distribution of their tryptic cleavage sites, and to pinpoint their signature peptides. Our findings showed that 23.7% of missing proteins were hydrophobic proteins possessing transmembrane domains (TMD). In addition, forty missing entries generate tryptic peptides that were either outside the mass detection range (>30 aa) or mapped to different proteins (<9 aa). Additionally, 21% of missing entries did not generate any unique tryptic peptides. An in silico endopeptidase combination strategy increased the possibility of identifying missing proteins. Likewise, using both a mature protein database and a signal peptidome database could be a promising option for identifying some missing proteins by targeting their unique N-terminal tryptic peptide from the mature protein database and/or their C-terminal tryptic peptide from the signal peptidome database. In conclusion, identification of missing proteins requires additional consideration during sample preparation, extraction, digestion, and data analysis to increase the likelihood of identification.
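
The tryptic-peptide criteria in the abstract can be illustrated with a minimal Python sketch. It is not the authors' workflow. It assumes the common simplified trypsin rule (cleave after K or R unless the next residue is P), no missed cleavages, and an inclusive 9 to 30 aa detection window inferred from the ">30aa" and "<9aa" thresholds above. The function names and the toy sequences are hypothetical.

```python
from collections import Counter


def tryptic_digest(sequence):
    """In silico digest: cleave after K or R unless the next residue is P.

    Assumption: simplified trypsin rule, no missed cleavages.
    """
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        # Keep the C-terminal fragment that does not end in K or R.
        peptides.append(sequence[start:])
    return peptides


def detectable_peptides(sequence, min_len=9, max_len=30):
    """Keep peptides inside the assumed detection window (inclusive bounds)."""
    return [p for p in tryptic_digest(sequence) if min_len <= len(p) <= max_len]


def unique_peptides(proteins, min_len=9, max_len=30):
    """Map each protein name to its in-window peptides found in no other entry.

    `proteins` is a dict of name -> amino acid sequence (toy database).
    """
    digests = {
        name: set(detectable_peptides(seq, min_len, max_len))
        for name, seq in proteins.items()
    }
    counts = Counter(p for peps in digests.values() for p in peps)
    return {
        name: sorted(p for p in peps if counts[p] == 1)
        for name, peps in digests.items()
    }
```

An entry whose list comes back empty from `unique_peptides` corresponds to the situation described above, where a protein yields no unique tryptic peptide in the detectable range. Repeating the digest with a different cleavage rule would be one way to explore the endopeptidase combination idea.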

KEYWORDS:

Bioinformatics; Missing protein; Signal peptidome; Transmembrane domain

PMID: 27535355
DOI: 10.1016/j.jprot.2016.08.005
[Indexed for MEDLINE]
