Format

Send to

Choose Destination
Nucleic Acids Res. 2015 Feb 18;43(3):e15. doi: 10.1093/nar/gku1196. Epub 2014 Nov 20.

Rapid phylogenetic analysis of large samples of recombinant bacterial whole genome sequences using Gubbins.

Author information

1
Pathogen Genomics, The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK Center for Communicable Disease Dynamics, Harvard School of Public Health, 677 Longwood Avenue, Boston, MA 02115, USA Department of Infectious Disease Epidemiology, Imperial College London, St. Mary's Campus, Norfolk Place, London W2 1PG, UK.
2
Pathogen Genomics, The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK.
3
Pathogen Genomics, The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK Cardiff School of Biosciences, Sir Martin Evans Building, Museum Avenue, Cardiff CF10 3AX, UK.
4
School of Computing, Engineering and Mathematics, University of Brighton, Brighton BN2 4GJ, UK.
5
Pathogen Genomics, The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK Department of Medicine, University of Cambridge, Addenbrooke's Hospital, Cambridge CB2 0SP, UK.
6
Pathogen Genomics, The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK simon.harris@sanger.ac.uk.

Abstract

The emergence of new sequencing technologies has facilitated the use of bacterial whole genome alignments for evolutionary studies and outbreak analyses. These datasets, of increasing size, often include examples of multiple different mechanisms of horizontal sequence transfer resulting in substantial alterations to prokaryotic chromosomes. The impact of these processes demands rapid and flexible approaches able to account for recombination when reconstructing isolates' recent diversification. Gubbins is an iterative algorithm that uses spatial scanning statistics to identify loci containing elevated densities of base substitutions suggestive of horizontal sequence transfer while concurrently constructing a maximum likelihood phylogeny based on the putative point mutations outside these regions of high sequence diversity. Simulations demonstrate the algorithm generates highly accurate reconstructions under realistically parameterized models of bacterial evolution, and achieves convergence in only a few hours on alignments of hundreds of bacterial genome sequences. Gubbins is appropriate for reconstructing the recent evolutionary history of a variety of haploid genotype alignments, as it makes no assumptions about the underlying mechanism of recombination. The software is freely available for download at github.com/sanger-pathogens/Gubbins, implemented in Python and C and supported on Linux and Mac OS X.

PMID:
25414349
PMCID:
PMC4330336
DOI:
10.1093/nar/gku1196
[Indexed for MEDLINE]
Free PMC Article

Supplemental Content

Full text links

Icon for Silverchair Information Systems Icon for PubMed Central
Loading ...
Support Center