Display Settings:

Format

Send to:

Choose Destination
See comment in PubMed Commons below
Proc Natl Acad Sci U S A. 2005 Mar 22;102(12):4436-41. Epub 2005 Mar 11.

Incorporating gene-specific variation when inferring and evaluating optimal evolutionary tree topologies from multilocus sequence data.

Author information

  • 1Bioinformatics Research Center, North Carolina State University, Box 7566, Raleigh, NC 27695-7566, USA. seo@iu.a.u-tokyo.ac.jp

Abstract

Because of the increase of genomic data, multiple genes are often available for the inference of phylogenetic relationships. The simple approach for combining multiple genes from the same taxon is to concatenate the sequences and then ignore the fact that different positions in the concatenated sequence came from different genes. Here, we discuss two criteria for inferring the optimal tree topology from data sets with multiple genes. These criteria are designed for multigene data sets where gene-specific evolutionary features are too important to ignore. One criterion is conventional and is obtained by taking the sum of log-likelihoods over all genes. The other criterion is obtained by dividing the log-likelihood for a gene by its sequence length and then taking the arithmetic mean over genes of these ratios. A similar strategy could be adopted with parsimony scores. The optimal tree is then declared to be the one for which the sum or the arithmetic mean is maximized. These criteria are justified within a two-stage hierarchical framework. The first level of the hierarchy represents gene-specific evolutionary features, and the second represents site-specific features for given genes. For testing significance of the optimal topology, we suggest a two-stage bootstrap procedure that involves resampling genes and then resampling alignment columns within resampled genes. An advantage of this procedure over concatenation is that it can effectively account for gene-specific evolutionary features. We discuss the applicability of the two-stage bootstrap idea to the Kishino-Hasegawa test and the Shimodaira-Hasegawa test.

PMID:
15764703
[PubMed - indexed for MEDLINE]
PMCID:
PMC555482
Free PMC Article
PubMed Commons home

PubMed Commons

0 comments
How to join PubMed Commons

    Supplemental Content

    Full text links

    Icon for HighWire Icon for PubMed Central
    Loading ...
    Write to the Help Desk