BMC Genomics. 2016 Nov 2;17(1):847.

In silico region of difference (RD) analysis of Mycobacterium tuberculosis complex from sequence reads using RD-Analyzer.

Faksri K1,2, Xia E3, Tan JH4, Teo YY3,4,5,6,7, Ong RT8.

Author information

1 Department of Microbiology, Faculty of Medicine, Khon Kaen University, Khon Kaen, Thailand.
2 Research and Diagnostic Center for Emerging Infectious Diseases (RCEID), Khon Kaen University, Khon Kaen, Thailand.
3 NUS Graduate School for Integrative Sciences and Engineering, National University of Singapore, Singapore, Singapore.
4 Saw Swee Hock School of Public Health, National University of Singapore, Tahir Foundation Building, 12 Science Drive 2, #10-01, Singapore, 117549, Singapore.
5 Department of Statistics and Applied Probability, National University of Singapore, Singapore, Singapore.
6 Life Sciences Institute, National University of Singapore, Singapore, Singapore.
7 Genome Institute of Singapore, Singapore, Singapore.
8 Saw Swee Hock School of Public Health, National University of Singapore, Tahir Foundation Building, 12 Science Drive 2, #10-01, Singapore, 117549, Singapore. twee_hee_ong@nuhs.edu.sg.

Abstract

BACKGROUND:

Whole-genome sequencing is increasingly used in the clinical diagnosis of tuberculosis and in the study of the Mycobacterium tuberculosis complex (MTC). The MTC consists of several genetically homogeneous mycobacterial species that can cause tuberculosis in humans and animals. Regions of difference (RDs) are commonly regarded as gold-standard genetic markers for MTC classification.

RESULTS:

We developed RD-Analyzer, a tool that can accurately infer the species and lineage of MTC isolates from sequence reads based on the presence or absence of a set of 31 RDs. Applied to a publicly available, diverse set of 377 sequenced MTC isolates from known major species and lineages, RD-Analyzer achieved an accuracy of 98.14% (370/377) in species prediction and a concordance of 98.47% (257/261) in Mycobacterium tuberculosis lineage prediction when compared with predictions based on single nucleotide polymorphism markers. By comparing sequencing read depths at each genomic position between isolates of different sublineages, we were able to identify the known RD markers in different sublineages of Lineage 4 and provide support for six potential delineating markers with high sensitivities and specificities for sublineage prediction. An extended version of RD-Analyzer was thus developed to allow user-defined RDs for lineage prediction.
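
As an illustration of the general idea of calling RDs present or absent from read depth and matching the resulting pattern to a classification, a minimal Python sketch follows. It is not RD-Analyzer's actual implementation: the function names, the 0.25 depth-ratio threshold, the marker names (RD_A, RD_B) and the group labels are all hypothetical placeholders chosen for this example.

```python
from statistics import median


def rd_is_present(region_depths, genome_median_depth, min_ratio=0.25):
    """Call an RD present when the median read depth over the region is at
    least min_ratio times the genome-wide median depth.
    The 0.25 default is an assumed, illustrative threshold."""
    if genome_median_depth <= 0:
        raise ValueError("genome-wide median depth must be positive")
    if not region_depths:
        return False  # no coverage data over the region: treat as absent
    return median(region_depths) / genome_median_depth >= min_ratio


def rd_profile(depths_by_rd, genome_median_depth, min_ratio=0.25):
    """Map each RD name to True (present) or False (absent)."""
    return {
        name: rd_is_present(depths, genome_median_depth, min_ratio)
        for name, depths in depths_by_rd.items()
    }


def match_signatures(profile, signatures):
    """Return the labels whose required presence/absence pattern is fully
    satisfied by the observed profile."""
    return [
        label
        for label, required in signatures.items()
        if all(profile.get(rd) == state for rd, state in required.items())
    ]


# Hypothetical per-position depths for two placeholder markers.
depths = {"RD_A": [40, 42, 38], "RD_B": [0, 0, 1]}
# Hypothetical signatures: which markers must be present (True) or absent (False).
signatures = {
    "group_1": {"RD_A": True, "RD_B": False},
    "group_2": {"RD_B": True},
}

profile = rd_profile(depths, genome_median_depth=40)
calls = match_signatures(profile, signatures)
```

Here `profile` marks RD_A as present and RD_B as absent, so only the placeholder label `group_1` matches. A real analysis would derive the depths from reads mapped against a reference and would need a threshold validated on isolates of known lineage.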

CONCLUSIONS:

RD-Analyzer is a useful and accurate tool for species, lineage and sublineage prediction using known RDs of MTC from sequence reads, and it can be extended to accept user-defined RDs for analysis. RD-Analyzer is written in Python and is freely available at https://github.com/xiaeryu/RD-Analyzer.

KEYWORDS:

Mycobacterium tuberculosis complex; Region of difference analysis; Whole-genome sequence analysis

PMID: 27806686
PMCID: PMC5093977
DOI: 10.1186/s12864-016-3213-1
[Indexed for MEDLINE]
Free PMC Article
