Algorithms Mol Biol. 2017 Mar 9;12:3. doi: 10.1186/s13015-017-0095-y. eCollection 2017.

Approximating the DCJ distance of balanced genomes in linear time.

Author information

1. Faculdade de Computação, Universidade Federal de Mato Grosso do Sul, Campo Grande, MS Brazil.
2. Faculty of Technology and Center for Biotechnology (CeBiTec), Bielefeld University, Bielefeld, Germany.

Abstract

BACKGROUND:

Rearrangements are large-scale mutations in genomes, responsible for complex changes and structural variations. Most rearrangements that modify the organization of a genome can be represented by the double cut and join (DCJ) operation. Given two balanced genomes, i.e., two genomes in which each gene occurs exactly the same number of times, we are interested in the problem of computing the rearrangement distance between them, i.e., finding the minimum number of DCJ operations that transform one genome into the other. This problem is known to be NP-hard.
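
To make the notion of balanced genomes concrete, the following is a minimal sketch rather than code from the paper. It assumes a genome is represented as a list of signed integers, where the absolute value identifies the gene and the sign its orientation; this representation and the names `gene_counts` and `is_balanced` are assumptions made for illustration.

```python
from collections import Counter


def gene_counts(genome):
    """Count how often each gene occurs, ignoring orientation (sign).

    Assumed representation: a list of non-zero signed integers.
    """
    return Counter(abs(gene) for gene in genome)


def is_balanced(genome_a, genome_b):
    """Return True if every gene occurs equally often in both genomes."""
    return gene_counts(genome_a) == gene_counts(genome_b)


if __name__ == "__main__":
    a = [1, -2, 1, 3]
    b = [3, 1, 2, -1]
    print(is_balanced(a, b))
    print(is_balanced([1, 2], [1, 1]))
```

Under this assumed representation, the first pair is balanced because gene 1 occurs twice and genes 2 and 3 once in both genomes, while the second pair is not.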

RESULTS:

We propose a linear-time approximation algorithm with approximation factor O(k) for the DCJ distance problem, where k is the maximum number of occurrences of any gene in the input genomes. Our algorithm works for linear and circular unichromosomal balanced genomes and uses, as an intermediate step, an O(k)-approximation for the minimum common string partition problem, which is closely related to the DCJ distance problem.
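
The two quantities named above can be illustrated with a short sketch. It is not the approximation algorithm itself: it only computes the parameter k for a sequence and checks whether two given block lists form a common string partition in the standard unsigned sense (both lists concatenate to their strings and contain the same blocks with the same multiplicities). Plain strings stand in for genomes here, gene orientation is ignored, and the function names are hypothetical.

```python
from collections import Counter


def max_occurrence(sequence):
    """Return k, the maximum number of occurrences of any symbol."""
    return max(Counter(sequence).values(), default=0)


def is_common_partition(s, t, blocks_s, blocks_t):
    """Check that blocks_s and blocks_t form a common string partition.

    Assumption: unsigned strings, so reversed blocks are not matched.
    """
    covers_s = "".join(blocks_s) == s
    covers_t = "".join(blocks_t) == t
    same_blocks = Counter(blocks_s) == Counter(blocks_t)
    return covers_s and covers_t and same_blocks


if __name__ == "__main__":
    s, t = "abcab", "ababc"
    print(max_occurrence(s))
    print(is_common_partition(s, t, ["abc", "ab"], ["ab", "abc"]))
```

The minimum common string partition problem asks for such a partition with as few blocks as possible; the check above only validates a candidate and says nothing about its optimality.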

CONCLUSIONS:

Experiments on simulated data sets show that our approximation algorithm is very competitive both in efficiency and in quality of the solutions.

KEYWORDS:

Approximation algorithms; Comparative genomics; Double cut and join (DCJ); Genome rearrangements
