Genome Res. 2018 May;28(5):609-624. doi: 10.1101/gr.230938.117. Epub 2018 Apr 6.

Recognition of the polycistronic nature of human genes is critical to understanding the genotype-phenotype relationship.

Author information

1. Biochemistry Department, Université de Sherbrooke, Quebec J1E 4K8, Canada.
2. Groupe de recherche PRIMUS, Department of Family and Emergency Medicine, Quebec J1H 5N4, Canada.
3. PROTEO, Quebec Network for Research on Protein Function, Structure, and Engineering, Université Laval, Quebec G1V 0A6, Canada.
4. Pediatric Department, Centre Hospitalier de l'Université de Sherbrooke, Quebec J1H 5N4, Canada.
5. Department of Nuclear Medicine & Radiobiology, Université de Sherbrooke, Quebec J1H 5N4, Canada.

Abstract

Technological advances promise unprecedented opportunities for whole-exome sequencing and proteomic analyses of populations. Currently, data from genome and exome sequencing or proteomic studies are searched against reference genome annotations. This provides the foundation for research and clinical screening for genetic causes of pathologies. However, current genome annotations substantially underestimate the proteomic information encoded within a gene. Numerous studies have now demonstrated the expression and function of alternative (mainly small, sometimes overlapping) ORFs within mature gene transcripts. This has important consequences for the correlation of phenotypes and genotypes. Most alternative ORFs are not yet annotated because of a lack of evidence, and this absence from databases precludes their detection by standard proteomic methods, such as mass spectrometry. Here, we demonstrate how current approaches tend to overlook alternative ORFs, hindering the discovery of new genetic drivers and fundamental research. We discuss available tools and techniques to improve identification of proteins from alternative ORFs and finally suggest a novel annotation system to permit a more complete representation of the transcriptomic and proteomic information contained within a gene. Given the crucial challenge of distinguishing functional ORFs from random ones, the suggested pipeline emphasizes both experimental data and conservation signatures. The addition of alternative ORFs to databases will render identification less serendipitous and advance the pace of research and genomic knowledge. This review highlights the urgent medical and research need to incorporate alternative ORFs in current genome annotations and thus permit their inclusion in hypotheses and models that relate phenotypes and genotypes.

PMID: 29626081
PMCID: PMC5932603
DOI: 10.1101/gr.230938.117
[Indexed for MEDLINE]
Free PMC Article
