Bioinformatics. 2014 Sep 1;30(17):i541-8. doi: 10.1093/bioinformatics/btu462.

ASTRAL: genome-scale coalescent-based species tree estimation.

Author information

1
Department of Computer Science, The University of Texas at Austin, Austin, TX 78712, USA, Departement d'informatique, Ecole Normale Superieure, 45 Rue d'Ulm, F-75230 Paris Cedex 05, France and Department of Electrical Engineering, The University of Southern California, Los Angeles, CA 90089, USA.
2
Department of Computer Science, The University of Texas at Austin, Austin, TX 78712, USA, Departement d'informatique, Ecole Normale Superieure, 45 Rue d'Ulm, F-75230 Paris Cedex 05, France and Department of Electrical Engineering, The University of Southern California, Los Angeles, CA 90089, USA.

Abstract

MOTIVATION:

Species trees provide insight into basic biology, including the mechanisms of evolution and how it modifies biomolecular function and structure, biodiversity and co-evolution between genes and species. Yet, gene trees often differ from species trees, creating challenges to species tree estimation. One of the most frequent causes of conflicting topologies between gene trees and species trees is incomplete lineage sorting (ILS), which is modelled by the multi-species coalescent. While many methods have been developed to estimate species trees from multiple genes, some of which have statistical guarantees under the multi-species coalescent model, existing methods are too computationally intensive for use with genome-scale analyses or have been shown to have poor accuracy under some realistic conditions.

RESULTS:

We present ASTRAL, a fast method for estimating species trees from multiple genes. ASTRAL is statistically consistent, can run on datasets with thousands of genes and has outstanding accuracy, improving on MP-EST and the population tree from BUCKy, two statistically consistent leading coalescent-based methods. ASTRAL is often more accurate than concatenation using maximum likelihood, except when ILS levels are low or there are too few gene trees.

AVAILABILITY AND IMPLEMENTATION:

ASTRAL is available in open source form at https://github.com/smirarab/ASTRAL/. Datasets studied in this article are available at http://www.cs.utexas.edu/users/phylo/datasets/astral.

SUPPLEMENTARY INFORMATION:

Supplementary data are available at Bioinformatics online.

PMID: 25161245
PMCID: PMC4147915
DOI: 10.1093/bioinformatics/btu462
[Indexed for MEDLINE]
Free PMC Article
