Mol Biol Evol. 2008 May;25(5):960-71. doi: 10.1093/molbev/msn043. Epub 2008 Feb 14.

Calculating bootstrap probabilities of phylogeny using multilocus sequence data.

Author information

1. Professional Programme for Agricultural Bioinformatics, University of Tokyo, 1-1-1 Yayoi Bunkyo-Ku, Tokyo 113-8657, Japan. seo@iu.a.u-tokyo.ac.jp

Abstract

Phylogeny estimation is crucial in the study of molecular evolution. The increase in the amount of available genomic data facilitates phylogeny estimation from multilocus sequence data. Although maximum likelihood and Bayesian methods are available for phylogeny reconstruction using multilocus sequence data, these methods require heavy computation, and their application is limited to the analysis of a moderate number of genes and taxa. Distance matrix methods are suitable alternatives for analyzing huge amounts of sequence data. However, how distance methods can be applied to multilocus sequence data remains unknown. Here, we suggest new procedures to estimate molecular phylogeny from multilocus sequence data and to evaluate its significance in the framework of the distance method. We found that concatenating multilocus sequence data may result in incorrect phylogeny estimation with an extremely high bootstrap probability (BP), which is due to incorrect estimation of the distances and to ignoring intergene variation. Therefore, we suggest that the distance matrices for multilocus sequence data be estimated separately for each gene and then combined to reconstruct the phylogeny, instead of reconstructing the phylogeny from concatenated sequence data. To calculate the BPs of the reconstructed phylogeny, we suggest adopting 2-stage bootstrap procedures in which genes are resampled first and sequence columns are then resampled within the resampled genes. Resampling the genes during the calculation of BPs ensures that intergene variation is properly considered. Via simulation studies and empirical data analysis, we demonstrate that our 2-stage bootstrap procedures are more suitable than the conventional bootstrap procedure applied after sequence concatenation.
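
The 2-stage resampling described in the abstract can be illustrated with a minimal Python sketch. It is not the author's implementation: the function names are hypothetical, the uncorrected p-distance stands in for whatever distance estimator the paper uses, and the unweighted mean is only one assumed way of combining per-gene distance matrices. Each replicate yields a combined distance matrix that could then be passed to any distance-based tree method (for example neighbor joining) to tally BPs.

```python
import random


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences (assumed estimator)."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be aligned and non-empty")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def distance_matrix(alignment):
    """Pairwise distances for one gene; alignment is a list of sequences, one per taxon."""
    n = len(alignment)
    return [[p_distance(alignment[i], alignment[j]) for j in range(n)]
            for i in range(n)]


def combine(matrices):
    """Element-wise unweighted mean of per-gene matrices (an assumed combination rule)."""
    k = len(matrices)
    n = len(matrices[0])
    return [[sum(m[i][j] for m in matrices) / k for j in range(n)]
            for i in range(n)]


def resample_columns(alignment, rng):
    """Stage 2: draw alignment columns with replacement within one gene."""
    length = len(alignment[0])
    cols = [rng.randrange(length) for _ in range(length)]
    return ["".join(seq[c] for c in cols) for seq in alignment]


def two_stage_replicate(genes, rng):
    """One bootstrap replicate: resample genes, then columns within each sampled gene."""
    sampled_genes = [rng.choice(genes) for _ in genes]               # stage 1
    resampled = [resample_columns(g, rng) for g in sampled_genes]    # stage 2
    # Distances are estimated per gene and only then combined,
    # rather than being computed on a concatenated alignment.
    return combine([distance_matrix(g) for g in resampled])


if __name__ == "__main__":
    # Toy data: two genes, three taxa in the same order in every gene.
    genes = [
        ["ACGTACGT", "ACGTACGA", "TCGTACGA"],
        ["GGGCCC", "GGGCCT", "GAGCCT"],
    ]
    rng = random.Random(0)
    replicates = [two_stage_replicate(genes, rng) for _ in range(100)]
    print(len(replicates), "replicate distance matrices generated")
```

Because genes are drawn with replacement before columns are, the spread among replicate matrices reflects variation between genes as well as sampling error within each gene, which is the property the abstract argues is lost when columns are resampled from a concatenated alignment.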

PMID: 18281270
DOI: 10.1093/molbev/msn043
[Indexed for MEDLINE]
