Syst Biol. 2019 Jul 31. pii: syz049. doi: 10.1093/sysbio/syz049. [Epub ahead of print]

Graph Splitting: A Graph-Based Approach for Superfamily-scale Phylogenetic Tree Reconstruction.

Author information

Department of Biological Sciences, Graduate School of Science, The University of Tokyo, Bunkyo-ku, Tokyo 113-0032, Japan.
Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Kashiwa, Chiba 277-8568, Japan.
Atmosphere and Ocean Research Institute, The University of Tokyo, Kashiwa, Chiba 277-8564, Japan.


A protein superfamily contains distantly related proteins that have acquired diverse biological functions through a long evolutionary history. Phylogenetic analysis of the early evolution of protein superfamilies is a key challenge because existing phylogenetic methods perform poorly when protein sequences are too diverged to construct an informative multiple sequence alignment. Here, we propose the Graph Splitting (GS) method, which rapidly reconstructs a protein superfamily-scale phylogenetic tree using a graph-based approach. Evolutionary simulation showed that the GS method accurately reconstructs phylogenetic trees and is robust to major problems in phylogenetic estimation, such as biased taxon sampling, heterogeneous evolutionary rates, and long-branch attraction, when sequences are substantially diverged. Its application to an empirical dataset of the triosephosphate isomerase (TIM)-barrel superfamily suggests rapid evolution of protein-mediated pyrimidine biosynthesis, likely taking place after the RNA world. Furthermore, the GS method can substantially improve the performance of widely used multiple sequence alignment methods by providing accurate guide trees.


Bioinformatics; Early evolution; Network analysis; Phylogenetic method; TIM-barrel superfamily

