Phylogenetic Clustering of Genes Reveals Shared Evolutionary Trajectories and Putative Gene Functions

Genome Biol Evol. 2018 Sep 1;10(9):2255-2265. doi: 10.1093/gbe/evy178.

Abstract

Homologous genes in prokaryotes can be described using phylogenetic profiles, which summarize their patterns of presence or absence across a set of genomes. Phylogenetic profiles have been used for nearly twenty years to cluster genes based on measures such as the Euclidean distance between profile vectors. However, most approaches do not take into account the phylogenetic relationships among the profiled genomes, and overrepresentation of certain taxonomic groups (e.g., pathogenic species with many sequenced representatives) can skew the interpretation of profiles. We propose a new approach that uses a coevolutionary method defined by Pagel to account for the phylogenetic relationships among target organisms, and a hierarchical-clustering approach to define sets of genes with common distributions across the organisms. The clusters we obtain using our method show greater evidence of phylogenetic and functional clustering than those from a recently published approach based on hidden Markov models. Our clustering method identifies sets of amino-acid biosynthesis genes that constitute cohesive pathways, and motility/chemotaxis genes with common histories of descent and lateral gene transfer.
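
To make the idea of a phylogenetic profile concrete, the following minimal sketch clusters genes by the Euclidean distance between their presence/absence vectors, i.e., the conventional baseline the abstract mentions, not the Pagel-based coevolutionary method the authors propose. The gene names, the toy profiles, the choice of average linkage, and the helper names (`euclidean`, `average_linkage`, `cluster_genes`) are all assumptions made for illustration and do not come from the paper.

```python
from itertools import combinations
from math import sqrt

# Hypothetical toy data: gene -> presence (1) / absence (0) across six genomes.
PROFILES = {
    "geneA": (1, 1, 1, 0, 0, 0),
    "geneB": (1, 1, 1, 0, 0, 0),
    "geneC": (1, 1, 0, 0, 0, 0),
    "geneD": (0, 0, 0, 1, 1, 1),
    "geneE": (0, 0, 0, 1, 1, 0),
}


def euclidean(p, q):
    """Euclidean distance between two equal-length profile vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def average_linkage(c1, c2, profiles):
    """Mean pairwise distance between the genes of two clusters."""
    dists = [euclidean(profiles[g], profiles[h]) for g in c1 for h in c2]
    return sum(dists) / len(dists)


def cluster_genes(profiles, n_clusters):
    """Agglomerative clustering: merge the closest pair until n_clusters remain."""
    clusters = [frozenset([g]) for g in sorted(profiles)]
    while len(clusters) > n_clusters:
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: average_linkage(pair[0], pair[1], profiles),
        )
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return sorted(sorted(c) for c in clusters)


if __name__ == "__main__":
    print(cluster_genes(PROFILES, 2))
```

Note that this baseline treats every genome as an independent observation, so several closely related genomes carrying the same gene count as separate pieces of evidence. That is the bias the abstract attributes to approaches that ignore the phylogenetic relationships among the profiled genomes.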

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Bacterial Proteins / genetics
  • Biological Coevolution
  • Clostridiales / genetics*
  • Cluster Analysis
  • Evolution, Molecular*
  • Gene Transfer, Horizontal
  • Genome, Bacterial*
  • Markov Chains
  • Multigene Family
  • Phylogeny*

Substances

  • Bacterial Proteins