Evolutionary Modeling of Genotype-Phenotype Associations, and Application to Primate Coding and Non-coding mtDNA Rate Variation

Evol Bioinform Online. 2013 Jul 28:9:301-16. doi: 10.4137/EBO.S11600. Print 2013.

Abstract

Variation in substitution rates across a phylogeny can indicate shifts in the evolutionary dynamics of a protein-coding or non-protein-coding region. One way to understand these signals is to seek the phenotypic correlates of rate variation. Here, we extended a previously published likelihood method designed to detect evolutionary associations between genotypic evolutionary rate and phenotype over a phylogeny. In simulations with two discrete phenotype categories, the method has a low false-positive rate and detects more than 80% of true positives when the tree length is three or greater and the substitution rate changes three-fold or more given the phenotype. In addition, we extended the test from two to four phenotype categories and evaluated its performance. We then applied the method to two major hypotheses for rate variation in the mitochondrial genome of primates (longevity and generation time), as well as to body mass, which is correlated with many aspects of life history, using three phenotype categories obtained by discretizing continuous values. Similar to previous results for mammals, we found that the majority of mitochondrial protein-coding genes show associations consistent with the longevity and body mass predictions, and that the predominant signal of association comes from the third codon position. We also found a significant association between maximum lifespan and the evolutionary rate of the mtDNA control region. In contrast, 24 protein-coding genes from the nuclear genome did not show a consistent pattern of association, which is inconsistent with the generation time hypothesis. These results show that the extended method can robustly identify genotype-phenotype associations with up to at least four phenotypic categories, and demonstrate the successful application of the method to the study of factors affecting neutral evolutionary rate in protein-coding and non-coding loci.

Keywords: generation time hypothesis; genotype-phenotype; longevity hypothesis; mitochondria; primate.
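
The abstract states that continuous phenotype values were discretized into three categories, but it does not say how the cut points were chosen. As an illustration only, the following minimal Python sketch assumes equal-frequency (quantile) binning. The function name `discretize`, the species labels, and the lifespan values are all hypothetical and are not taken from the paper.

```python
from bisect import bisect_right

def discretize(values, n_categories=3):
    """Assign each value to one of n_categories equal-frequency bins (0 = lowest).

    Assumption: quantile-based cut points; the paper's actual scheme is not
    given in the abstract.
    """
    ordered = sorted(values)
    n = len(ordered)
    # Cut points are the sorted values at positions n/k, 2n/k, ...
    cuts = [ordered[(i * n) // n_categories] for i in range(1, n_categories)]
    # A value equal to a cut point falls into the higher category.
    return [bisect_right(cuts, v) for v in values]

# Hypothetical maximum lifespans in years (illustrative, not data from the study).
lifespans = {"sp_A": 12.0, "sp_B": 18.5, "sp_C": 27.0,
             "sp_D": 35.0, "sp_E": 45.0, "sp_F": 60.0}
categories = dict(zip(lifespans, discretize(list(lifespans.values()))))
print(categories)
```

Each species then carries a category label (0, 1, or 2) that could serve as the discrete phenotype state in an association test of this kind. With tied or heavily skewed values, equal-frequency bins can be uneven, so the cut points are worth inspecting before use.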