Mol Phylogenet Evol. 2006 Aug;40(2):428-34. Epub 2006 May 2.

Phylogenetic estimation under codon models can be biased by codon usage heterogeneity.

Author information

1. Center for Computational Sciences, Institute of Biological Sciences, University of Tsukuba, Tsukuba, Ibaraki 305-8577, Japan. yuji@ccs.tsukuba.ac.jp

Abstract

In theory, codon models best describe sequence evolution in protein-coding genes, because they account for the dependence of nucleotide substitutions among codon positions as well as for differences between synonymous and non-synonymous changes. In practice, however, we know little about how often the assumptions of codon model-based estimation are violated, or how severe the resulting artifacts may be. In nucleotide-based phylogenies inferred from first and second codon positions of a concatenated plastid gene data set, two distantly related taxa (dinoflagellate and haptophyte plastids) were robustly grouped together. This artifactual grouping is attributed to parallel heterogeneity in leucine (Leu) and serine (Ser) codon usage in the data set. Here, using this data set, we demonstrated that codon-based phylogenetic estimations were seriously biased when the model assumption of homogeneous codon composition was violated, robustly uniting the dinoflagellate and haptophyte plastids in a single clade. Our results suggest that similar phylogenetic artifacts may arise in codon model-based estimation from codon usage heterogeneity in any amino acid. We advise that homogeneity of codon usage across taxa in a data set be confirmed before codon model-based phylogenetic estimation is attempted.
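
The homogeneity check recommended above could be approached in several ways; the abstract does not say which procedure the authors used. As one minimal sketch, the Python below tabulates Leu and Ser codon counts per taxon and computes a Pearson chi-square statistic for homogeneity across taxa. The function names (`codon_counts`, `chi_square_homogeneity`), the choice of a chi-square test, and the toy sequences are all assumptions made for illustration, and the standard genetic code is assumed for the Leu and Ser codon sets.

```python
from collections import Counter

# Assumption: standard genetic code, six leucine and six serine codons.
LEU = ("TTA", "TTG", "CTT", "CTC", "CTA", "CTG")
SER = ("TCT", "TCC", "TCA", "TCG", "AGT", "AGC")


def codon_counts(seq, codons):
    """Count in-frame occurrences of the given codons in a coding sequence."""
    seq = seq.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    counts = Counter(seq[i:i + 3] for i in range(0, usable, 3))
    return [counts[c] for c in codons]


def chi_square_homogeneity(table):
    """Pearson chi-square statistic and degrees of freedom for a
    taxa-by-codon count table. All-zero rows and columns are dropped."""
    col_totals = [sum(col) for col in zip(*table)]
    keep = [j for j, t in enumerate(col_totals) if t > 0]
    rows = [[row[j] for j in keep] for row in table if sum(row) > 0]
    if not rows:
        return 0.0, 0
    row_totals = [sum(r) for r in rows]
    col_totals = [sum(c) for c in zip(*rows)]
    total = sum(row_totals)
    stat = 0.0
    for r, rt in zip(rows, row_totals):
        for obs, ct in zip(r, col_totals):
            expected = rt * ct / total
            stat += (obs - expected) ** 2 / expected
    dof = (len(rows) - 1) * (len(col_totals) - 1)
    return stat, dof


if __name__ == "__main__":
    # Hypothetical toy sequences, not data from the study.
    taxa = {
        "taxon_A": "TTATTATTACTGTCTTCTAGT",
        "taxon_B": "CTGCTGCTGTTAAGTAGTTCT",
    }
    for name, codons in (("Leu", LEU), ("Ser", SER)):
        table = [codon_counts(seq, codons) for seq in taxa.values()]
        stat, dof = chi_square_homogeneity(table)
        print(f"{name}: chi-square = {stat:.2f}, df = {dof}")
```

A large statistic relative to the chi-square distribution with the reported degrees of freedom would flag heterogeneous usage for that amino acid. With real alignments, small expected counts can make the chi-square approximation unreliable, so the result is best read as a screening signal.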

PMID: 16647273
DOI: 10.1016/j.ympev.2006.03.020
[Indexed for MEDLINE]
