Format

Send to

Choose Destination
Bioinformatics. 2014 Sep 1;30(17):i519-26. doi: 10.1093/bioinformatics/btu463.

Polytomy refinement for the correction of dubious duplications in gene trees.

Author information

1
Department of Computer Science, Université de Montréal, Montréal, Quebec H3C 3J7, Canada, LaBRI, Université Bordeaux 1, Bordeaux, France, Department of Mathematics, Simon Fraser University, Burnaby (BC) V5A 1S6, Canada and Universitá degli Studi di Bergamo, Bergamo 24129 IT, Italy.
2
Department of Computer Science, Université de Montréal, Montréal, Quebec H3C 3J7, Canada, LaBRI, Université Bordeaux 1, Bordeaux, France, Department of Mathematics, Simon Fraser University, Burnaby (BC) V5A 1S6, Canada and Universitá degli Studi di Bergamo, Bergamo 24129 IT, Italy Department of Computer Science, Université de Montréal, Montréal, Quebec H3C 3J7, Canada, LaBRI, Université Bordeaux 1, Bordeaux, France, Department of Mathematics, Simon Fraser University, Burnaby (BC) V5A 1S6, Canada and Universitá degli Studi di Bergamo, Bergamo 24129 IT, Italy.

Abstract

MOTIVATION:

Large-scale methods for inferring gene trees are error-prone. Correcting gene trees for weakly supported features often results in non-binary trees, i.e. trees with polytomies, thus raising the natural question of refining such polytomies into binary trees. A feature pointing toward potential errors in gene trees are duplications that are not supported by the presence of multiple gene copies.

RESULTS:

We introduce the problem of refining polytomies in a gene tree while minimizing the number of created non-apparent duplications in the resulting tree. We show that this problem can be described as a graph-theoretical optimization problem. We provide a bounded heuristic with guaranteed optimality for well-characterized instances. We apply our algorithm to a set of ray-finned fish gene trees from the Ensembl database to illustrate its ability to correct dubious duplications.

AVAILABILITY AND IMPLEMENTATION:

The C++ source code for the algorithms and simulations described in the article are available at http://www-ens.iro.umontreal.ca/~lafonman/software.php.

SUPPLEMENTARY INFORMATION:

Supplementary data are available at Bioinformatics online.

PMID:
25161242
PMCID:
PMC4147916
DOI:
10.1093/bioinformatics/btu463
[Indexed for MEDLINE]
Free PMC Article

Supplemental Content

Full text links

Icon for Silverchair Information Systems Icon for PubMed Central
Loading ...
Support Center