J Bioinform Comput Biol. 2013 Dec;11(6):1343005. doi: 10.1142/S0219720013430051. Epub 2013 Dec 2.

A simple shortcut to unsupervised alignment-free phylogenetic genome groupings, even from unassembled sequencing reads.

Author information

Bioinformatics Institute (BII), Agency for Science Technology and Research (A*STAR), 30 Biopolis Street, #07-01, Matrix, Singapore 138671, Singapore; School of Biological Sciences (SBS), Nanyang Technological University (NTU), 60 Nanyang Drive, Singapore 637551, Singapore.


We propose an extension to alignment-free approaches that can produce reasonably accurate phylogenetic groupings starting from unaligned genomes, for example, in as little as 1 min on a standard desktop computer for 25 bacterial genomes. A 6-fold speed-up and an 11-fold reduction in memory requirements compared to previous alignment-free methods are achieved by reducing the comparison space to a representative sample of k-mers of optimal length and with specific tag motifs. We applied this approach to the test case of fitting the enterohemorrhagic O104:H4 E. coli strain from the 2011 outbreak in Germany into the phylogenetic network of previously known E. coli-related strains, and we extended the method to allow any new strain to be assigned to the correct phylogenetic group directly from unassembled short sequence reads from next-generation sequencing data. Hence, this approach is also useful for quickly identifying the most suitable reference genome for subsequent assembly steps.
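The idea of restricting the comparison to k-mers that carry a tag motif can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state the k-mer length, the tag motifs, or the distance measure, so `k=12`, the tag `"AT"`, the Jaccard distance, and the containment score used for reads are all placeholder assumptions, and the function names are hypothetical. Reverse complements are ignored for brevity.

```python
from itertools import combinations

def tagged_kmers(seq, k=12, tag="AT"):
    """Set of k-mers in seq that start with the tag motif (k and tag are assumed values)."""
    seq = seq.upper()
    valid = set("ACGT")
    return {
        seq[i:i + k]
        for i in range(len(seq) - k + 1)
        if seq.startswith(tag, i) and set(seq[i:i + k]) <= valid
    }

def jaccard_distance(a, b):
    """1 - |a ∩ b| / |a ∪ b|; stand-in for whatever distance the paper uses."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)

def distance_matrix(genomes, k=12, tag="AT"):
    """Pairwise distances between genomes given as {name: sequence}."""
    profiles = {name: tagged_kmers(seq, k, tag) for name, seq in genomes.items()}
    return {
        (x, y): jaccard_distance(profiles[x], profiles[y])
        for x, y in combinations(sorted(profiles), 2)
    }

def closest_reference(reads, references, k=12, tag="AT"):
    """Name of the reference sharing the largest fraction of the reads' tagged k-mers."""
    read_kmers = set()
    for read in reads:
        read_kmers |= tagged_kmers(read, k, tag)
    if not read_kmers:
        return None  # no tagged k-mers found in the reads
    scores = {
        name: len(read_kmers & tagged_kmers(seq, k, tag)) / len(read_kmers)
        for name, seq in references.items()
    }
    return max(scores, key=scores.get)
```

The resulting distance matrix could be passed to any standard clustering or tree-building routine, and `closest_reference` mirrors the use case of picking a reference genome directly from unassembled reads. Filtering by tag keeps only a fraction of all k-mers, which is the source of the time and memory savings in this kind of scheme.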

[Indexed for MEDLINE]
