Bioinformatics. 2017 Jan 1;33(1):119-121. doi: 10.1093/bioinformatics/btw586. Epub 2016 Sep 7.

stringMLST: a fast k-mer based tool for multilocus sequence typing.

Author information

1 School of Biology, Georgia Institute of Technology, Atlanta, GA 30332, USA.
2 Applied Bioinformatics Laboratory, Atlanta, GA 30332, USA.
3 PanAmerican Bioinformatics Institute, Cali, Valle del Cauca 760043, Colombia.

Abstract

Rapid and accurate identification of the sequence type (ST) of bacterial pathogens is critical for epidemiological surveillance and outbreak control. Cheaper and faster next-generation sequencing (NGS) technologies have taken precedence over the traditional method of amplicon sequencing for multilocus sequence typing (MLST). However, data generated by NGS platforms require quality control, genome assembly and sequence similarity searching before an isolate's ST can be determined. These are computationally intensive and time-consuming steps, which are not ideally suited for real-time molecular epidemiology. Here, we present stringMLST, an assembly- and alignment-free, lightweight, platform-independent program capable of rapidly typing bacterial isolates directly from raw sequence reads. The program implements a simple hash table data structure to find exact matches between short sequence strings (k-mers) and an MLST allele library. We show that stringMLST is more accurate, and an order of magnitude faster, than contemporary genome-based ST detection tools.
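
The hash table idea described above can be illustrated with a minimal Python sketch. This is not the stringMLST source code: the function names, the toy allele sequences, the k-mer length of 5 and the ST numbers are all hypothetical, chosen only to show how exact k-mer matches against an allele library can vote for one allele per locus without assembly or alignment.

```python
from collections import Counter, defaultdict


def kmers(seq, k):
    """Yield every substring of length k in seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_kmer_index(alleles, k):
    """Hash table: k-mer -> set of (locus, allele_id) pairs that contain it."""
    index = defaultdict(set)
    for (locus, allele_id), seq in alleles.items():
        for kmer in kmers(seq, k):
            index[kmer].add((locus, allele_id))
    return index


def call_alleles(reads, index, k):
    """Count exact k-mer hits per allele, then keep the top allele per locus."""
    hits = Counter()
    for read in reads:
        for kmer in kmers(read, k):
            for key in index.get(kmer, ()):
                hits[key] += 1
    best = {}
    for (locus, allele_id), count in hits.items():
        # Ties are not resolved here; the first allele seen wins.
        if locus not in best or count > best[locus][1]:
            best[locus] = (allele_id, count)
    return {locus: allele_id for locus, (allele_id, _) in best.items()}


# Toy allele library (hypothetical sequences, far shorter than real alleles).
alleles = {
    ("adk", 1): "ACGTACGTAC",
    ("adk", 2): "ACGTTCGTAC",
    ("gyrB", 1): "TTGACCGGTA",
    ("gyrB", 2): "TTGACCGCTA",
}
# Toy profile table: allele combination -> ST (hypothetical numbers).
profiles = {(1, 1): 1, (2, 1): 7}
loci = ("adk", "gyrB")

k = 5
index = build_kmer_index(alleles, k)
reads = ["CGTTCGT", "GACCGGTA"]  # stand-ins for raw sequence reads
calls = call_alleles(reads, index, k)
sequence_type = profiles.get(tuple(calls[locus] for locus in loci))
print(sequence_type)  # 7
```

Each read is scanned once and every k-mer costs a single dictionary lookup, which is why an approach of this kind can avoid the assembly and similarity-search steps. A practical tool would also need to consider reverse-complement reads, sequencing errors, ties between alleles and loci with no hits, none of which this sketch handles.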

AVAILABILITY AND IMPLEMENTATION:

The source code and documentation are available at http://jordan.biology.gatech.edu/page/software/stringMLST

CONTACT:

lavanya.rishishwar@gatech.edu

SUPPLEMENTARY INFORMATION:

Supplementary data are available at Bioinformatics online.

PMID: 27605103
DOI: 10.1093/bioinformatics/btw586
[Indexed for MEDLINE]
