Format

Send to

Choose Destination
PLoS One. 2013;8(1):e53608. doi: 10.1371/journal.pone.0053608. Epub 2013 Jan 7.

Taxonomic classification of bacterial 16S rRNA genes using short sequencing reads: evaluation of effective study designs.

Author information

1
Department of Human Genetics, University of Chicago, Chicago, Illinois, United States of America.

Abstract

Massively parallel high throughput sequencing technologies allow us to interrogate the microbial composition of biological samples at unprecedented resolution. The typical approach is to perform high-throughout sequencing of 16S rRNA genes, which are then taxonomically classified based on similarity to known sequences in existing databases. Current technologies cause a predicament though, because although they enable deep coverage of samples, they are limited in the length of sequence they can produce. As a result, high-throughout studies of microbial communities often do not sequence the entire 16S rRNA gene. The challenge is to obtain reliable representation of bacterial communities through taxonomic classification of short 16S rRNA gene sequences. In this study we explored properties of different study designs and developed specific recommendations for effective use of short-read sequencing technologies for the purpose of interrogating bacterial communities, with a focus on classification using naïve Bayesian classifiers. To assess precision and coverage of each design, we used a collection of ∼8,500 manually curated 16S rRNA gene sequences from cultured bacteria and a set of over one million bacterial 16S rRNA gene sequences retrieved from environmental samples, respectively. We also tested different configurations of taxonomic classification approaches using short read sequencing data, and provide recommendations for optimal choice of the relevant parameters. We conclude that with a judicious selection of the sequenced region and the corresponding choice of a suitable training set for taxonomic classification, it is possible to explore bacterial communities at great depth using current technologies, with only a minimal loss of taxonomic resolution.

PMID:
23308262
PMCID:
PMC3538547
DOI:
10.1371/journal.pone.0053608
[Indexed for MEDLINE]
Free PMC Article

Supplemental Content

Full text links

Icon for Public Library of Science Icon for PubMed Central
Loading ...
Support Center