gEVE: a genome-based endogenous viral element database provides comprehensive viral protein-coding sequences in mammalian genomes

Database (Oxford). 2016 May 30:2016:baw087. doi: 10.1093/database/baw087. Print 2016.

Abstract

In mammals, approximately 10% of genome sequences correspond to endogenous viral elements (EVEs), which are derived from ancient viral infections of germ cells. Although most EVEs have been inactivated, some open reading frames (ORFs) of EVEs obtained functions in the hosts. However, EVE ORFs usually remain unannotated in the genomes, and no databases are available for EVE ORFs. To investigate the function and evolution of EVEs in mammalian genomes, we developed EVE ORF databases for 20 genomes of 19 mammalian species. A total of 736,771 non-overlapping EVE ORFs were identified and archived in a database named gEVE (http://geve.med.u-tokai.ac.jp). The gEVE database provides nucleotide and amino acid sequences, genomic loci and functional annotations of EVE ORFs for all 20 genomes. In analyzing RNA-seq data with the gEVE database, we successfully identified the expressed EVE genes, suggesting that the gEVE database facilitates studies of the genomic analyses of various mammalian species.Database URL: http://geve.med.u-tokai.ac.jp.

Publication types

  • Dataset
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Computational Biology / methods*
  • Databases, Genetic*
  • Genome / genetics*
  • Mammals / genetics*
  • Open Reading Frames / genetics
  • Phylogeny
  • Viral Proteins / genetics*

Substances

  • Viral Proteins