Accurate, rapid taxonomic classification of fungal large-subunit rRNA genes

Appl Environ Microbiol. 2012 Mar;78(5):1523-33. doi: 10.1128/AEM.06826-11. Epub 2011 Dec 22.

Abstract

Taxonomic and phylogenetic fingerprinting based on sequence analysis of gene fragments from the large-subunit rRNA (LSU) gene or the internal transcribed spacer (ITS) region is becoming an integral part of fungal classification. The lack of an accurate and robust classification tool trained by a validated sequence database for taxonomic placement of fungal LSU genes is a severe limitation in taxonomic analysis of fungal isolates or large data sets obtained from environmental surveys. Using a hand-curated set of 8,506 fungal LSU gene fragments, we determined the performance characteristics of a naïve Bayesian classifier across multiple taxonomic levels and compared the classifier performance to that of a sequence similarity-based (BLASTN) approach. The naïve Bayesian classifier was computationally more rapid (>460-fold with our system) than the BLASTN approach, and it provided equal or superior classification accuracy. Classifier accuracies were compared using sequence fragments of 100 bp and 400 bp and two different PCR primer anchor points to mimic sequence read lengths commonly obtained using current high-throughput sequencing technologies. Accuracy was higher with 400-bp sequence reads than with 100-bp reads. It was also significantly affected by sequence location across the 1,400-bp test region. The highest accuracy was obtained across either the D1 or D2 variable region. The naïve Bayesian classifier provides an effective and rapid means to classify fungal LSU sequences from large environmental surveys. The training set and tool are publicly available through the Ribosomal Database Project.

Publication types

  • Comparative Study
  • Evaluation Study
  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Classification / methods*
  • Computational Biology / methods*
  • Fungi / classification*
  • Fungi / genetics*
  • Genes, rRNA / genetics*
  • RNA, Fungal / genetics*
  • RNA, Ribosomal / genetics*
  • Sensitivity and Specificity
  • Time Factors

Substances

  • RNA, Fungal
  • RNA, Ribosomal
  • RNA, ribosomal, 26S