Sci Rep. 2015 Jul 9;5:10940. doi: 10.1038/srep10940.

Revealing Missing Human Protein Isoforms Based on Ab Initio Prediction, RNA-seq and Proteomics.

Author information

1
[1] School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai 200240, China [2] Shanghai Center for Bioinformation Technology, 1278 Keyuan Road, Pudong District, Shanghai 201203, China.
2
[1] Department of Genetics and Molecular Pathology, Centre for Cancer Biology, Frome Road, Adelaide, SA 5000, Australia [2] School of Biological Sciences, University of Adelaide, SA 5005, Australia [3] School of Medicine, University of Adelaide, North Terrace, Adelaide, SA 5000, Australia [4] School of Pharmacy and Medical Sciences, Division of Health Sciences, University of South Australia, SA, Australia [5] ACRF Cancer Genomics Facility, Centre for Cancer Biology, SA Pathology, Frome Road, Adelaide, SA 5000, Australia.
3
Shanghai Center for Bioinformation Technology, 1278 Keyuan Road, Pudong District, Shanghai 201203, China.
4
[1] Shanghai Center for Bioinformation Technology, 1278 Keyuan Road, Pudong District, Shanghai 201203, China [2] CAS-MPG Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, 320 Yueyang Road, Shanghai 200031, China.
5
School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai 200240, China.
6
School of Biological Sciences, University of Adelaide, SA 5005, Australia.
7
[1] Department of Genetics and Molecular Pathology, Centre for Cancer Biology, Frome Road, Adelaide, SA 5000, Australia [2] Department of Biomedical Informatics (DBMI), Vanderbilt University Medical Center (VUMC), 2525 West End Ave, Suite 800, Nashville, TN 37203, USA.
8
[1] Department of Genetics and Molecular Pathology, Centre for Cancer Biology, Frome Road, Adelaide, SA 5000, Australia [2] School of Biological Sciences, University of Adelaide, SA 5005, Australia.
9
Department of Genetics and Molecular Pathology, Centre for Cancer Biology, Frome Road, Adelaide, SA 5000, Australia.
10
[1] Department of Genetics and Molecular Pathology, Centre for Cancer Biology, Frome Road, Adelaide, SA 5000, Australia [2] School of Biological Sciences, University of Adelaide, SA 5005, Australia [3] School of Medicine, University of Adelaide, North Terrace, Adelaide, SA 5000, Australia.
11
Department of Biomedical Informatics (DBMI), Vanderbilt University Medical Center (VUMC), 2525 West End Ave, Suite 800, Nashville, TN 37203, USA.
12
Institute of Immunology, Second Military Medical University, 800 Xiangyin Road, Shanghai 200433, China.

Abstract

Biological and biomedical research relies on a comprehensive understanding of protein-coding transcripts. However, the total number of human proteins is still unknown due to the prevalence of alternative splicing. In this paper, we detected 31,566 novel transcripts with coding potential by filtering our ab initio predictions with 50 RNA-seq datasets from diverse tissues and cell lines. PCR followed by MiSeq sequencing showed that at least 84.1% of the predicted novel splice sites could be validated. In contrast to known transcripts, the expression of these novel transcripts was highly tissue-specific. Based on these novel transcripts, at least 36 novel proteins were detected in shotgun proteomics data from 41 breast samples. We also showed that L1 retrotransposons have a more significant impact on the origin of new transcripts and genes than previously thought. Furthermore, we found that alternative splicing is extraordinarily widespread among genes involved in specific biological functions such as protein binding, nucleoside binding, neuron projection, membrane organization and cell adhesion. Overall, the total number of human transcripts with protein-coding potential was estimated to be at least 204,950.

PMID: 26156868
PMCID: PMC4496727
DOI: 10.1038/srep10940
[Indexed for MEDLINE]
Free PMC Article
