Methods Mol Biol. 2019;1851:63-81. doi: 10.1007/978-1-4939-8736-8_4.

Computational Prediction of De Novo Emerged Protein-Coding Genes.

Author information

1. Department of Genetics, Trinity College Dublin, Smurfit Institute of Genetics, University of Dublin, Dublin, Ireland. vakirlisnikos@gmail.com.
2. Department of Genetics, Trinity College Dublin, Smurfit Institute of Genetics, University of Dublin, Dublin, Ireland.

Abstract

De novo genes, that is, protein-coding genes originating from previously noncoding sequence, have gone from being considered impossibly unlikely to being recognized as an important source of genetic novelty in eukaryotic genomes. De novo gene evolution is a rare but consistent feature of eukaryotic genomes and has been detected in every genome studied. However, different studies often use different computational methods, and the numbers and identities of the detected genes vary greatly. Here we present a coherent protocol for the computational identification of de novo genes by comparative genomics. The method uses homology searches, identification of syntenic regions, and ancestral sequence reconstruction to produce high-confidence candidates with robust evidence of de novo emergence. It is designed to be easily applicable given basic knowledge of bioinformatics tools, and scalable so that it can be applied to both large and small datasets.

KEYWORDS:

De novo genes; Gene birth; Genome evolution; Genome-wide detection; New gene evolution; Novel genes; ORF formation; Protein-coding genes

PMID: 30298392
DOI: 10.1007/978-1-4939-8736-8_4
[Indexed for MEDLINE]