PlantOrDB: a genome-wide ortholog database for land plants and green algae

BMC Plant Biol. 2015 Jun 26;15:161. doi: 10.1186/s12870-015-0531-4.

Abstract

Background: Genes with different functions originally arise from ancestral genes through gene duplication, mutation and functional recombination. It is widely accepted that orthologs are homologous genes that diverged through speciation events, while paralogs are homologous genes that resulted from gene duplication events. With the rapid increase of genomic data, identifying and distinguishing these genes among different species is becoming an important part of functional genomics research.

Description: Using 35 plant and 6 green algal genomes from Phytozome v9, we clustered 1,291,670 peptide sequences into 49,355 homologous gene families based on sequence similarity. For each gene family, we generated a peptide sequence alignment and a phylogenetic tree, and identified the speciation or duplication event at every node within the tree. For each node, we also identified and highlighted diagnostic characters that facilitate the appropriate addition of a new query sequence to the existing phylogenetic tree and sequence alignment of its best-matched gene family. Users can selectively view the phylogenetic tree, sequence alignment and diagnostic characters of a given gene family for a chosen species or subgroup of species. PlantOrDB not only allows users to identify orthologs and paralogs from phylogenetic trees, but also provides all orthologs identified using the Reciprocal Best Hit (RBH) pairwise alignment method. Users can upload their own sequences to find the best-matched gene families, and visualize their query sequences within the relevant phylogenetic trees and sequence alignments.
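The Reciprocal Best Hit criterion mentioned above can be illustrated with a minimal sketch. It is not PlantOrDB's implementation: the gene identifiers, the scores, and the function names best_hits and reciprocal_best_hits are all hypothetical, and the input is assumed to be precomputed pairwise alignment scores (for example, bit scores) in both search directions between two species.

```python
def best_hits(scores):
    """Map each query gene to its highest-scoring subject gene.

    `scores` maps (query, subject) pairs to an alignment score.
    On ties, the first pair encountered is kept (an arbitrary choice
    made for this sketch).
    """
    best = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs where a's best hit is b and b's best hit is a."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


# Hypothetical scores: species A genes searched against species B, and back.
a_vs_b = {
    ("A1", "B1"): 250.0,
    ("A1", "B2"): 90.0,
    ("A2", "B2"): 180.0,
    ("A3", "B2"): 200.0,
}
b_vs_a = {
    ("B1", "A1"): 245.0,
    ("B2", "A3"): 198.0,
    ("B2", "A2"): 175.0,
}

rbh_pairs = reciprocal_best_hits(a_vs_b, b_vs_a)
print(rbh_pairs)
```

In this toy input, A2 and A3 both have B2 as their best hit, but B2's best hit is A3, so only the A3 and B2 pair is reciprocal. This shows why RBH yields at most one ortholog per gene per species pair, whereas a tree-based approach can also recover one-to-many relationships.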

Conclusion: PlantOrDB (http://bioinfolab.miamioh.edu/plantordb) is a genome-wide ortholog database for land plants and green algae. PlantOrDB offers highly interactive visualization, accurate query classification and powerful search functions useful for functional genomics research.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algal Proteins / chemistry
  • Algal Proteins / genetics*
  • Algal Proteins / metabolism
  • Amino Acid Sequence
  • Chlorophyta / genetics*
  • Chlorophyta / metabolism
  • Databases, Nucleic Acid / organization & administration*
  • Embryophyta / genetics*
  • Embryophyta / metabolism
  • Evolution, Molecular
  • Genome, Plant*
  • Phylogeny
  • Plant Proteins / chemistry
  • Plant Proteins / genetics*
  • Plant Proteins / metabolism
  • Sequence Alignment

Substances

  • Algal Proteins
  • Plant Proteins