Syst Biol. 2019 Aug 29. pii: syz058. doi: 10.1093/sysbio/syz058. [Epub ahead of print]

Quartet-based computations of internode certainty provide robust measures of phylogenetic incongruence.

Author information

1. Guangdong Province Key Laboratory of Microbial Signals and Disease Control, Integrative Microbiology Research Centre, South China Agricultural University, Guangzhou, 510642, P.R. China.
2. The Exelixis Lab, Scientific Computing Group, Heidelberg Institute for Theoretical Studies, Heidelberg D-68159, Germany.
3. Department of Informatics, Institute of Theoretical Informatics, Karlsruhe Institute of Technology, 76128 Karlsruhe, Germany.
4. Department of Biological Sciences, Vanderbilt University, Nashville, TN, USA.

Abstract

Incongruence, or topological conflict, is prevalent in genome-scale data sets. Internode Certainty (IC) and related measures were recently introduced to explicitly quantify the level of incongruence of a given internal branch among a set of phylogenetic trees and complement regular branch support measures (e.g., bootstrap, posterior probability) that instead assess the statistical confidence of inference. Since most phylogenomic studies contain data partitions (e.g., genes) with missing taxa and IC scores stem from the frequencies of bipartitions (or splits) on a set of trees, IC score calculation typically requires adjusting the frequencies of bipartitions from these partial gene trees. However, when the proportion of missing taxa is high, the scores yielded by current approaches that adjust bipartition frequencies in partial gene trees differ substantially from each other and tend to be overestimates. To overcome these issues, we developed three new IC measures based on the frequencies of quartets, which naturally apply to both complete and partial trees. Comparison of our new quartet-based measures to previous bipartition-based measures on simulated data shows that: 1) on complete data sets, both quartet-based and bipartition-based measures yield very similar IC scores; 2) IC scores of quartet-based measures on a given data set with and without missing taxa are more similar than the scores of bipartition-based measures; and 3) quartet-based measures are more robust to the absence of phylogenetic signal and errors in phylogenetic inference than bipartition-based measures. Additionally, the analysis of an empirical mammalian phylogenomic data set using our quartet-based measures reveals the presence of substantial levels of incongruence for numerous internal branches. An efficient open-source implementation of these quartet-based measures is freely available in the program QuartetScores (https://github.com/lutteropp/QuartetScores).
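
To illustrate how an entropy-style certainty score can be derived from frequencies, the following Python sketch may help. It assumes the commonly cited bipartition-based definition IC = 1 + p1*log2(p1) + p2*log2(p2), where p1 and p2 are the relative frequencies of a branch's bipartition and of its most prevalent conflicting bipartition. The quartet variant is an assumption made by analogy (three possible topologies per quartet, logarithm base 3, negative sign when the reference topology is not the most frequent) and is not taken from the QuartetScores source code. The function names `internode_certainty` and `quartet_certainty` are hypothetical.

```python
import math


def internode_certainty(freq_main, freq_conflict):
    """Entropy-style IC from two bipartition frequencies (assumed definition).

    freq_main: how often the branch's bipartition occurs in the tree set.
    freq_conflict: how often the most prevalent conflicting bipartition occurs.
    Returns 1.0 with no conflict and 0.0 when both are equally frequent.
    """
    total = freq_main + freq_conflict
    if total <= 0:
        raise ValueError("at least one frequency must be positive")
    score = 1.0
    for freq in (freq_main, freq_conflict):
        p = freq / total
        if p > 0:  # treat 0 * log(0) as 0
            score += p * math.log2(p)
    return score


def quartet_certainty(counts):
    """Analogous score over the three topologies of one quartet (assumption).

    counts: (reference, alternative_1, alternative_2) topology counts.
    Uses log base 3 so the score is 0 when all three are equally frequent.
    The sign is flipped when the reference topology is not the most frequent;
    this sign convention is an illustrative assumption.
    """
    total = sum(counts)
    if total <= 0:
        raise ValueError("at least one count must be positive")
    score = 1.0
    for count in counts:
        p = count / total
        if p > 0:
            score += p * math.log(p, 3)
    if counts[0] < max(counts):
        score = -score
    return score
```

Because quartet counts can be taken directly from whichever taxa a partial gene tree contains, a score of this shape needs no adjustment of bipartition frequencies, which is the property the abstract highlights.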

KEYWORDS:

missing taxa; phylogenetic conflict; phylogenetic discordance; phylogenetic signal; phylogenetics; phylogenomics; robustness

PMID: 31504977
DOI: 10.1093/sysbio/syz058
