The tangled bank of amino acids

Protein Sci. 2016 Jul;25(7):1354-62. doi: 10.1002/pro.2930. Epub 2016 May 12.

Abstract

Amino acid substitution matrices used to model protein evolution have yielded important insights into both the evolutionary process and the properties of specific protein families. To keep these models tractable, standard substitution matrices represent the average results of the evolutionary process rather than the underlying molecular biophysics and population genetics, treating a protein as a set of independently evolving sites rather than as an integrated biomolecular entity. With advances in computing and the increasing availability of sequence data, we now have an opportunity to move beyond current substitution matrices to more interpretable mechanistic models with greater fidelity to the evolutionary process of mutation and selection and to the holistic nature of the selective constraints. As part of this endeavour, we consider how epistatic interactions induce spatial and temporal rate heterogeneity, and demonstrate how these generally ignored factors can reconcile standard substitution rate matrices with the underlying biology, allowing us to better understand the meaning of these substitution rates. Using computational simulations of protein evolution, we demonstrate the importance of both spatial and temporal heterogeneity in modelling protein evolution.
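As an illustration of what a mutation-selection description of substitution rates can look like, the sketch below uses one common textbook formulation (rate = mutation rate × relative fixation probability, with the weak-mutation fixation factor S / (1 - exp(-S))). This is not taken from the paper and may differ from the authors' simulations; the function names, the two example sites, and all fitness values are hypothetical. It shows how the same pair of amino acids can exchange at different rates at different sites (spatial heterogeneity). If epistasis changed a site's fitness values as the rest of the protein evolved, the same calculation would give rates that also vary over time.

```python
import math


def fixation_factor(s_scaled):
    """Fixation probability relative to a neutral mutation.

    s_scaled is the scaled selection coefficient (difference in scaled
    fitness between the new and the old amino acid). Assumes the
    weak-mutation limit; returns S / (1 - exp(-S)), which tends to 1 as S -> 0.
    """
    if abs(s_scaled) < 1e-9:
        return 1.0
    return s_scaled / -math.expm1(-s_scaled)


def substitution_rate(mu, fitness_from, fitness_to):
    """Substitution rate = mutation rate x relative fixation probability."""
    return mu * fixation_factor(fitness_to - fitness_from)


# Hypothetical scaled fitnesses of three amino acids at two sites.
# The "buried" site strongly disfavours a charged residue; the
# "exposed" site is treated as neutral. Values are illustrative only.
site_buried = {"L": 0.0, "I": -0.5, "D": -6.0}
site_exposed = {"L": 0.0, "I": 0.0, "D": 0.0}

mu = 1.0  # assumed uniform mutation rate, arbitrary units

for name, site in (("buried", site_buried), ("exposed", site_exposed)):
    for target in ("I", "D"):
        rate = substitution_rate(mu, site["L"], site[target])
        print(f"{name:8s} L->{target}: {rate:.4f}")
```

Averaging such site-specific rates over many sites and times would give a single matrix entry per amino acid pair, which is the sense in which a standard substitution matrix summarises rather than models the underlying process.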

Keywords: epistasis; epistatic interactions; evolutionary Stokes shift; evolutionary process; molecular evolution; phylogenetics; protein evolution; substitution matrices; substitution rates.

MeSH terms

  • Amino Acid Substitution*
  • Computer Simulation
  • Databases, Protein
  • Evolution, Molecular
  • Models, Genetic
  • Mutation Rate
  • Phylogeny
  • Proteins / chemistry
  • Proteins / genetics*
  • Selection, Genetic

Substances

  • Proteins