The mathematics of xenology: di-cographs, symbolic ultrametrics, 2-structures and tree-representable systems of binary relations

J Math Biol. 2017 Jul;75(1):199-237. doi: 10.1007/s00285-016-1084-3. Epub 2016 Nov 30.

Abstract

The concepts of orthology, paralogy, and xenology play a key role in molecular evolution. Orthology and paralogy distinguish whether a pair of genes originated by speciation or duplication. The corresponding binary relations on a set of genes form complementary cographs. Allowing more than two types of ancestral events leads to symmetric symbolic ultrametrics. Horizontal gene transfer, which leads to xenologous gene pairs, is however inherently asymmetric, since one offspring copy "jumps" into another genome while the other continues to be inherited vertically. We therefore explore here the mathematical structure of the non-symmetric generalization of symbolic ultrametrics. Our main results tie non-symmetric ultrametrics together with di-cographs (the directed generalization of cographs), so-called uniformly non-prime (unp) 2-structures, and hierarchical structures on the set of strong modules. This yields a characterization of relation structures that can be explained in terms of trees and types of ancestral events. This framework accommodates a horizontal-transfer relation in terms of an ancestral event and thus is slightly different from the most commonly used definition of xenology. As a first step towards practical use, we present a simple polynomial-time recognition algorithm for unp 2-structures and investigate the computational complexity of several types of editing problems for unp 2-structures. Finally, we show that these NP-complete problems can be solved exactly as Integer Linear Programs.
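To make the symmetric base case concrete (an orthology relation forming a cograph that can be explained by a tree with event-labeled inner nodes), the following is a minimal sketch of cograph recognition. It is not the recognition algorithm for 2-structures presented in the paper. It relies on the standard characterization that a graph is a cograph exactly when every induced subgraph on at least two vertices is disconnected or has a disconnected complement. The function names `components` and `cotree`, the node labels `"parallel"` and `"series"`, and the toy gene names are illustrative assumptions, and vertices are assumed to be sortable, non-None labels.

```python
def components(vertices, adj):
    """Connected components of the subgraph induced by `vertices`."""
    vertices = set(vertices)
    seen, comps = set(), []
    for v in sorted(vertices):  # sorted only to make the output deterministic
        if v in seen:
            continue
        stack, comp = [v], {v}
        while stack:
            u = stack.pop()
            for w in adj[u] & vertices:
                if w not in comp:
                    comp.add(w)
                    stack.append(w)
        seen |= comp
        comps.append(comp)
    return comps


def cotree(vertices, edges):
    """Return a nested-tuple cotree, or None if the graph is not a cograph.

    Hypothetical helper for illustration. Inner nodes are
    ("parallel", children) when the induced subgraph is disconnected and
    ("series", children) when its complement is disconnected.
    """
    vertices = set(vertices)
    adj = {v: set() for v in vertices}
    for u, w in edges:
        adj[u].add(w)
        adj[w].add(u)
    # Adjacency in the complement graph.
    co_adj = {v: vertices - adj[v] - {v} for v in vertices}

    def build(vs):
        if len(vs) == 1:
            return next(iter(vs))  # leaf: a single gene
        for label, nbrs in (("parallel", adj), ("series", co_adj)):
            parts = components(vs, nbrs)
            if len(parts) > 1:
                children = [build(p) for p in parts]
                if any(c is None for c in children):
                    return None
                return (label, children)
        return None  # graph and complement both connected: not a cograph

    return build(vertices)


# Toy orthology graph: a and b are paralogs, both orthologous to c.
tree = cotree({"a", "b", "c"}, [("a", "c"), ("b", "c")])
# A path on four vertices (P4) is the forbidden induced subgraph.
path = cotree({"a", "b", "c", "d"}, [("a", "b"), ("b", "c"), ("c", "d")])
```

If edges are read as orthologous pairs, a `"series"` node can be interpreted as a speciation event and a `"parallel"` node as a duplication event; this reading is an assumption of the sketch. The non-symmetric setting of the abstract would need directed edges and a further node type, which is not attempted here.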

Keywords: 2-Structures; Di-cograph; Gene tree; Integer Linear Program; NP-completeness; Orthologs; Paralogs; Recognition algorithm; Symbolic ultrametric; Uniformly non-prime decomposition; Xenologs.

MeSH terms

  • Evolution, Molecular*
  • Gene Transfer, Horizontal
  • Models, Biological*
  • Phylogeny*