Syst Biol. 2014 Jul;63(4):566-81. doi: 10.1093/sysbio/syu023. Epub 2014 Apr 2.

Supertrees Based on the Subtree Prune-and-Regraft Distance.

Author information

1. Faculty of Computer Science, Dalhousie University, 6050 University Avenue, PO Box 15000, Halifax, Nova Scotia, Canada B3H 4R2.
2. Faculty of Computer Science, Dalhousie University, 6050 University Avenue, PO Box 15000, Halifax, Nova Scotia, Canada B3H 4R2; beiko@cs.dal.ca.

Abstract

Supertree methods reconcile a set of phylogenetic trees into a single structure that is often interpreted as a branching history of species. A key challenge is combining conflicting evolutionary histories that are due to artifacts of phylogenetic reconstruction and phenomena such as lateral gene transfer (LGT). Many supertree approaches use optimality criteria that do not reflect underlying processes, have known biases, and may be unduly influenced by LGT. We present the first method to construct supertrees by using the subtree prune-and-regraft (SPR) distance as an optimality criterion. Although calculating the rooted SPR distance between a pair of trees is NP-hard, our new maximum agreement forest-based methods can reconcile trees with hundreds of taxa and >50 transfers in fractions of a second, which enables repeated calculations during the course of an iterative search. Our approach can accommodate trees in which uncertain relationships have been collapsed to multifurcating nodes. Using a series of benchmark datasets simulated under plausible rates of LGT, we show that SPR supertrees are more similar to correct species histories than supertrees based on parsimony or Robinson-Foulds distance criteria. We successfully constructed an SPR supertree from a phylogenomic dataset of 40,631 gene trees that covered 244 genomes representing several major bacterial phyla. Our SPR-based approach also allowed direct inference of highways of gene transfer between bacterial classes and genera. A small number of these highways connect genera in different phyla and can highlight specific genes implicated in long-distance LGT. [Lateral gene transfer; matrix representation with parsimony; phylogenomics; prokaryotic phylogeny; Robinson-Foulds; subtree prune-and-regraft; supertrees.]
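The contrast between the SPR and Robinson-Foulds criteria can be made concrete with a toy example. The sketch below is not the authors' maximum agreement forest algorithm and does not compute SPR distances. It only applies one hand-chosen SPR move (a stand-in for a single transfer event) to a small rooted tree and counts how many clades change under a rooted Robinson-Foulds measure. The nested-tuple tree encoding and the helper names clusters, rf_distance, prune, and regraft are assumptions made for this illustration.

```python
def clusters(tree):
    """Collect the non-trivial clades of a rooted tree given as nested tuples."""
    found = set()

    def leaves(node):
        if isinstance(node, str):          # a leaf is a taxon name
            return frozenset([node])
        below = frozenset().union(*(leaves(child) for child in node))
        found.add(below)                   # record the clade under this node
        return below

    found.discard(leaves(tree))            # the root clade is shared by all trees
    return found


def rf_distance(t1, t2):
    """Rooted Robinson-Foulds distance: clades present in exactly one tree."""
    return len(clusters(t1) ^ clusters(t2))


def prune(tree, leaf):
    """Remove one leaf and suppress the single-child node left behind."""
    if isinstance(tree, str):
        return None if tree == leaf else tree
    kept = [p for p in (prune(child, leaf) for child in tree) if p is not None]
    if not kept:
        return None
    return kept[0] if len(kept) == 1 else tuple(kept)


def regraft(tree, leaf, sibling):
    """Attach `leaf` as the sister of the leaf named `sibling`."""
    if isinstance(tree, str):
        return (leaf, tree) if tree == sibling else tree
    return tuple(regraft(child, leaf, sibling) for child in tree)


# Hypothetical five-taxon "species tree" (a caterpillar).
species_tree = (((("a", "b"), "c"), "d"), "e")

# One SPR move: prune taxon "a" and regraft it next to "e".
gene_tree = regraft(prune(species_tree, "a"), "a", "e")

print(gene_tree)
print(rf_distance(species_tree, gene_tree))
```

In this toy case the two trees differ by a single SPR move, yet every non-trivial clade of the first tree is absent from the second, so the rooted Robinson-Foulds count is 6. This illustrates why a criterion that counts SPR moves can be less sensitive to individual long-distance transfers than one that counts mismatched clades.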

PMID: 24695589
PMCID: PMC4055872
DOI: 10.1093/sysbio/syu023
[Indexed for MEDLINE]
Free PMC Article