J Clin Microbiol. Sep 2004; 42(9): 3925–3931.
PMCID: PMC516356

rpoB Gene Sequencing for Identification of Corynebacterium Species


The genus Corynebacterium is a heterogeneous group of species comprising human and animal pathogens and environmental bacteria. It is defined on the basis of several phenotypic characters and the results of DNA-DNA relatedness and, more recently, 16S rRNA gene sequencing. However, the 16S rRNA gene is not polymorphic enough to ensure reliable phylogenetic studies and needs to be completely sequenced for accurate identification. The almost complete rpoB sequences of 56 Corynebacterium species were determined by both PCR and genome walking methods. In all cases the percent similarities between different species were lower than those observed by 16S rRNA gene sequencing, even for those species with degrees of high similarity. Several clusters supported by high bootstrap values were identified. In order to propose a method for strain identification which does not require sequencing of the complete rpoB sequence (approximately 3,500 bp), we identified an area with a high degree of polymorphism, bordered by conserved sequences that can be used as universal primers for PCR amplification and sequencing. The sequence of this fragment (434 to 452 bp) allows accurate species identification and may be used in the future for routine sequence-based identification of Corynebacterium species.

The genus Corynebacterium is one of the largest genera in the coryneform group of bacteria (which consists of irregular gram-positive rods and aerobically growing, asporogenous, non-partially acid-fast bacteria). Originally, the genus Corynebacterium was created essentially to accommodate the diphtheria bacillus and some other species pathogenic for animals. Bergey's Manual of Systematic Bacteriology listed only 17 Corynebacterium species; however, 11 new species were defined between 1987 and 1995 (6), and another 32 new species were described between 1996 and 2003. From 2001 to 2003, up to 13 new species were validly published (http://www.bacterio.cict.fr/c/corynebacterium.html). At present, the genus Corynebacterium contains more than 60 species, the vast majority of which have been isolated from human or animal samples. Chemotaxonomically, this genus includes species that possess wall chemotype IV (arabinose, galactose and meso-diaminopimelic acid), short-chain mycolic acids (approximately 22 to 36 carbon atoms), and DNA G+C contents ranging from 51 to 63 mol% (5, 6). The narrower definition of the genus Corynebacterium has resulted in the transfer of several species to other genera (Clavibacter, Rhodococcus, and Turicella). However, there is still some evidence of heterogeneity within the genus Corynebacterium. For example, Corynebacterium amycolatum and Corynebacterium kroppenstedtii lack mycolic acids (1, 2), and Corynebacterium afermentans and Corynebacterium auris exhibit G+C contents of more than 65 mol% (6). The use of molecular genetic methods such as 16S rRNA gene (rDNA) sequence analysis has facilitated a much tighter circumscription of the genus, and the availability of comparative 16S rRNA gene sequence data with improved phenotypic data has resulted in much improved and more reliable species identification (14, 16).
These improvements in taxonomy and means of detection, together with an increased interest in Corynebacterium as an opportunistic infectious agent in humans, have resulted in the delineation of a plethora of new Corynebacterium species from human sources in recent years (6, 8). However, the identification of Corynebacterium species is difficult because it often requires fastidious procedures, such as chromatography, or a high number of tests that are not available with commercial identification systems (5). The sequence of the 16S rRNA gene is the most widely used molecular marker to determine the phylogenetic relationships of bacteria. However, low intragenus polymorphism limits its usefulness for taxonomic analysis or identification to the species level. As an example, the species Corynebacterium pseudodiphtheriticum and Corynebacterium propinquum and the species Corynebacterium minutissimum and Corynebacterium aurimucosum have high 16S rDNA similarity values (99.3 and 98.7%, respectively). Moreover, from the perspective of automated systems for gene sequence-based identification, this low degree of polymorphism obligates sequencing of the complete 16S rRNA gene (approximately 1,500 bp) for accurate identification. Variable areas are found along the gene at positions 0 to 150, 300 to 400, 650 to 800, 850 to 950, and 1100 to 1250 (a total of 650 bp), with maximal variability ranging from 8 to 19% according to the region (Fig. 1).

FIG. 1.
Graphical representation of RSVs (y axis) in the rpoB and 16S rRNA gene sequences of Corynebacterium species studied by the use of windows of 50 nucleotides (the x axis indicates the nucleotide position). The hypervariable region bordered by conserved ...

Among the universal genes that can be used for taxonomic analysis and gene sequence-based identification, the RNA polymerase beta subunit-encoding gene (rpoB) was used to study several unrelated genera, including Bartonella spp. (15), Staphylococcus spp. (3), members of the family Enterobacteriaceae (13), Bosea spp. and Afipia spp. (9), Mycobacterium spp. (10), and Legionella spp. (11). In the study described here, we investigated the usefulness of rpoB sequencing for the differentiation and identification of 56 Corynebacterium species and 2 related species, Rhodococcus equi and Turicella otitidis. As rpoB is a large gene (approximately 3,500 bp), we also determined regions of variability in the sequence bordered by conserved sequences with the objective of designing universal primers for amplification of a small but discriminative sequence for use in the routine identification of Corynebacterium species.


Bacterial strains.

The bacterial strains used in this study are listed in Table 1. Most strains were obtained from the Collection de l'Institut Pasteur (CIP) and from the Culture Collection of the University of Göteborg, Göteborg, Sweden (CCUG). All strains were cultured on Columbia agar plates with 5% sheep blood (Trypticase soy agar; bioMérieux, Marcy-l'Etoile, France) and were incubated for 24 to 72 h at 30 to 37°C in a 5% CO2 atmosphere.

Species for which rpoB sequences were determined, including GenBank accession numbers and sizes of the sequences determined

rpoB gene amplification and sequencing.

The sequences of the rpoB gene from Corynebacterium species and from the species most closely related to Corynebacterium species were aligned in order to produce a consensus sequence. The bacteria chosen were Corynebacterium glutamicum, Amycolatopsis mediterranei, and Mycobacterium smegmatis (GenBank accession numbers NC_003450, AF242549, and MSU24494, respectively). The consensus sequence was used to generate primers that were used in PCRs, for genome walking (17), and for sequencing. Additional primers were selected from ongoing base sequence determinations. All primers used in this study are summarized in Table 2.

Primers used for amplification and sequencing of the rpoB gene in this study

Bacterial DNA was extracted from a heavy suspension of strains by using the QIAamp blood kit (Qiagen, Hilden, Germany), according to the recommendations of the manufacturer. All PCR mixtures contained 2.5 × 10⁻² U of Taq polymerase per μl; 1× Taq buffer; 1.8 mM MgCl2 (Gibco BRL, Life Technologies, Cergy Pontoise, France); dATP, dCTP, dTTP, and dGTP (Boehringer Mannheim GmbH, Hilden, Germany) at concentrations of 200 μM each; and each primer (Eurogentec, Seraing, Belgium) at a concentration of 0.2 μM. The PCR mixtures were subjected to 35 cycles of denaturation at 94°C for 30 s, primer annealing for 30 s, and extension at 72°C for 2 min. Every amplification program began with a denaturation step of 95°C for 2 min and ended with a final elongation step of 72°C for 10 min. Determination of the complete sequences of the rpoB sequence ends was achieved by use of the sequences of both the 3′ and the 5′ ends of the gene and amplification by PCR with the Universal GenomeWalker kit (Clontech Laboratories, Palo Alto, Calif.). Briefly, genomic DNA was digested with EcoRV, DraI, PvuII, StuI, and ScaI. The DNA fragments were then ligated with a GenomeWalker adaptor, which had one blunt end and one end with a 5′ overhang. The ligation mixture with the adaptor and the genomic DNA fragments were used as templates for the PCR. This PCR was performed by use of an adaptor primer supplied by the manufacturer and specific primers to walk through the DNA sequence downstream. For the amplification, 1.5 U of Elongase (Boehringer Mannheim) was used with 10 pmol of each primer, each deoxynucleoside triphosphate at a concentration of 20 mM, 10 mM Tris-HCl, 50 mM KCl, 1.6 mM MgCl2, and 5 μl of template with a final volume of 50 μl. Amplicons were purified for sequencing by use of a QIAquick spin PCR purification kit (Qiagen) by the protocol of the supplier.
Sequencing reactions were carried out with the reagents of the ABI Prism 3100 DNA sequencer (dRhod.Terminator RR Mix; Perkin-Elmer Applied Biosystems) by the standard automated sequencer protocol.
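
The cycling program and reagent concentrations given above lend themselves to a quick arithmetic check. The following is a minimal sketch, not part of the original protocol: it assumes a 50-μl reaction volume for the standard PCR (the text states this volume only for the GenomeWalker amplification) and ignores thermocycler ramp times. The function and constant names are illustrative.

```python
# Durations taken from the cycling program described in the text.
INITIAL_DENATURATION_S = 2 * 60   # 95 degrees C for 2 min
CYCLES = 35
DENATURATION_S = 30               # 94 degrees C
ANNEALING_S = 30                  # temperature depends on the primer pair
EXTENSION_S = 2 * 60              # 72 degrees C
FINAL_ELONGATION_S = 10 * 60      # 72 degrees C

# ASSUMPTION: 50 ul is stated only for the GenomeWalker PCR; it is reused
# here for the standard PCR purely for illustration.
REACTION_VOLUME_UL = 50.0


def program_minutes() -> float:
    """Total hold time of the program in minutes (ramp times ignored)."""
    per_cycle = DENATURATION_S + ANNEALING_S + EXTENSION_S
    total_s = INITIAL_DENATURATION_S + CYCLES * per_cycle + FINAL_ELONGATION_S
    return total_s / 60


def amount_pmol(concentration_um: float, volume_ul: float) -> float:
    """Amount of a reagent in pmol (1 uM x 1 ul = 1 pmol)."""
    return concentration_um * volume_ul


if __name__ == "__main__":
    print("program hold time (min):", program_minutes())
    print("each primer (pmol):", amount_pmol(0.2, REACTION_VOLUME_UL))
    print("each dNTP (pmol):", amount_pmol(200, REACTION_VOLUME_UL))
    print("Taq polymerase (U):", 2.5e-2 * REACTION_VOLUME_UL)
```

Under the assumed volume, 0.2 μM of each primer corresponds to 10 pmol per reaction, which is the same primer amount stated for the GenomeWalker amplification.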

Determination of discriminative partial sequences in 16S rRNA and rpoB genes.

In order to search for parts of sequences with high degrees of variability bordered by conserved regions, we used SVARAP software (Sequence Variability Analysis Program [http://ifr48.free.fr/recherche/jeu_cadre/jeu_rickettsie.html]) (9). After this analysis was done, the most polymorphic areas in rpoB were identified, and primers designed to be specific for the border conserved region were used for PCR amplification of this region. The PCR conditions that incorporated this consensus primer pair (C2700F-C3130R; Table 2) were the same as those described above. These primers were then used for amplification and sequencing of the hypervariable region for all the strains studied in this work.
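
The idea of scanning an alignment in fixed windows for variable stretches can be sketched as follows. This is not the SVARAP algorithm, whose exact variability measure is not described here; as a stand-in, the sketch reports the percentage of alignment columns in each window that contain more than one character state. The function name `window_variability` and the toy alignment in the usage note are hypothetical.

```python
from typing import List, Tuple


def window_variability(aligned: List[str], window: int = 50) -> List[Tuple[int, float]]:
    """Percentage of variable columns per window of an alignment.

    Returns (1-based start position, percent variable columns) for each
    consecutive, non-overlapping window. A column counts as variable when
    the sequences do not all carry the same character in it; gap
    characters are treated as just another state (a simplification).
    """
    if not aligned:
        raise ValueError("at least one sequence is required")
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to equal length")

    results = []
    for start in range(0, length, window):
        columns = range(start, min(start + window, length))
        variable = sum(
            1 for i in columns if len({seq[i].upper() for seq in aligned}) > 1
        )
        results.append((start + 1, 100.0 * variable / len(columns)))
    return results
```

For example, `window_variability(["ACGTACGT", "ACGTACGA", "ACGTTCGA"], window=4)` reports no variable columns in the first window and two of four in the second. Windows with high values flanked by windows with low values are the kind of region one would then inspect for primer design.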

rpoB sequence analysis.

The nucleotide sequences of the rpoB gene fragments obtained were processed into sequence data with Sequence Analysis Software (Applied Biosystems), and partial sequences were combined into a single consensus sequence with Sequence Assembler Software (Applied Biosystems). All GenBank accession numbers are listed in Table 1. Multiple-sequence alignments and percent similarities of the rpoB and 16S rRNA genes between the different species were obtained with the CLUSTAL W program (18) on the EMBL-EBI web server (http://www.ebi.ac.uk/clustalw/). Phylogenetic trees were obtained from DNA sequences by three different methods: the neighbor-joining, maximum-parsimony, and maximum-likelihood methods (4). Bootstrap replicates were performed in order to estimate the reliabilities of the nodes of the phylogenetic trees obtained. Bootstrap values were obtained from 1,000 trees generated randomly with the SEQBOOT program in the PHYLIP software package.
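
The resampling step behind bootstrap values can be illustrated with a short sketch. It shows the general principle only (drawing alignment columns with replacement to build one pseudo-replicate alignment) and is not the SEQBOOT implementation. The function name and the toy data in the usage note are hypothetical, and tree building on each replicate is not shown.

```python
import random
from typing import Dict


def bootstrap_alignment(alignment: Dict[str, str], rng: random.Random) -> Dict[str, str]:
    """Return one bootstrap pseudo-replicate of an alignment.

    Columns are drawn with replacement until the replicate has the same
    length as the input; every sequence receives the same drawn columns,
    so the column structure of the alignment is preserved.
    """
    if not alignment:
        raise ValueError("alignment is empty")
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("sequences must be aligned to equal length")
    length = lengths.pop()

    drawn = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[i] for i in drawn) for name, seq in alignment.items()}
```

Calling `bootstrap_alignment({"sp1": "ACGT", "sp2": "ACGA"}, random.Random(1))` repeatedly yields the replicates; a tree is then inferred from each one, and the bootstrap value of a node is the percentage of replicate trees in which that node is recovered.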


rpoB sequences of Corynebacterium species.

Almost complete rpoB gene sequences were determined for all strains. The rpoB sequence was more polymorphic than the 16S rDNA sequence. This higher degree of polymorphism was particularly evident for species not well differentiated by 16S rDNA sequence analysis (Table 3), as among the 11 pairs of species with 16S rRNA gene similarities ranging from 98.5 to 99.7%, the similarities of the rpoB gene ranged from 84.9 to 96.6%. The difference between the mean similarities of the 16S rRNA gene and rpoB gene sequences among these 11 pairs was statistically significant. This higher degree of polymorphism was also significant when it was calculated on the basis of range site variability (RSV) (Fig. 1). RSVs of ≥10 were observed in the rpoB gene for 44 of 67 windows of 50 nucleotides (WOFN) and in the 16S rRNA gene for 5 of 27 WOFN (P < 0.001 by the chi-square test). Likewise, RSVs of ≥20 were observed in the rpoB gene for 13 of 67 WOFN and in the 16S rRNA gene for 0 of 27 WOFN (P = 0.008 by Fisher's exact test). The similarity between the two C. afermentans subspecies was 98.2% and, thus, was 1.6% above the highest level of similarity between two species.
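
The two window-count comparisons reported above can be reproduced approximately from the stated counts alone. The sketch below is an illustration under explicit assumptions: the chi-square statistic is computed without continuity correction, and the Fisher's exact test is computed as a one-sided tail, because the text does not state which variants or software were used. The function names are hypothetical.

```python
from math import comb, erfc, sqrt
from typing import Tuple


def chi_square_2x2(a: int, b: int, c: int, d: int) -> Tuple[float, float]:
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (statistic, P value). For 1 degree of freedom the upper-tail
    probability equals erfc(sqrt(statistic / 2)).
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return chi2, erfc(sqrt(chi2 / 2))


def fisher_lower_tail(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher P value: probability that the count in cell c is
    at most its observed value, with all table margins held fixed."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2
    lowest = max(0, col1 - row1)
    return sum(
        comb(row2, x) * comb(row1, col1 - x) for x in range(lowest, c + 1)
    ) / comb(n, col1)


if __name__ == "__main__":
    # RSV >= 10: rpoB 44 of 67 windows, 16S rRNA gene 5 of 27 windows.
    print(chi_square_2x2(44, 67 - 44, 5, 27 - 5))
    # RSV >= 20: rpoB 13 of 67 windows, 16S rRNA gene 0 of 27 windows.
    print(fisher_lower_tail(13, 67 - 13, 0, 27 - 0))
```

With these assumptions, the first comparison gives a chi-square statistic of about 17.1 (P well below 0.001), and the one-sided Fisher tail for the second comparison is between 0.008 and 0.009, both in line with the values reported in the text.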

Comparison of similarities of 16S rRNA and rpoB gene sequences between the two subspecies of C. afermentans and among the 11 pairs of closely related species, with statistical comparison of mean similarities

Phylogenetic analysis.

On the basis of rpoB gene sequence analysis, phylogenetic analysis by the neighbor-joining, maximum-parsimony, and maximum-likelihood methods provided similar and reliable organizations for the four clusters supported by high bootstrap values (Fig. 2). In contrast, only cluster 4 was evident when 16S rRNA gene sequence analysis was used (Fig. 3). The bootstrap values at the nodes were in all cases higher than those observed by 16S rRNA gene sequencing. Values ≥95% were observed for 14 of 55 nodes for the 16S rRNA gene, whereas values ≥95% were observed for 24 of 55 nodes for the rpoB gene (P = 0.004 by the chi-square test). For some species, such as Corynebacterium testudinoris, Corynebacterium renale, Corynebacterium seminale, and Corynebacterium glucuronolyticum, the phylogenetic position was more difficult to assess. The position of T. otitidis in a genus separate from Corynebacterium is also questionable. Study of the rpoB gene confirms that the genus Rhodococcus is different from Corynebacterium, that Corynebacterium hoagii is not another species but is R. equi, and that C. seminale and C. glucuronolyticum are the same species (http://www.bacterio.cict.fr/c/corynebacterium.html).

FIG. 2.
Dendrogram representing the phylogenetic relationships of Corynebacterium species obtained by the neighbor-joining method. The tree was derived from the alignments of rpoB gene sequences. The support of each branch, as determined from 1,000 bootstrap ...
FIG. 3.
Dendrogram representing the phylogenetic relationships of Corynebacterium species obtained by the neighbor-joining method. The tree was derived from alignment of 16S rRNA gene sequences. The support of each branch, as determined from 1,000 bootstrap samples, ...

Strain identification.

Four highly variable zones were determined by the use of SVARAP software (Fig. 1). These zones were between positions 1 and 450, 800 and 1100, 1400 and 1750, and 2750 and 3200. Attempts to design universal primers that amplify hypervariable areas were unsuccessful for the first three regions. We designed a consensus primer pair (C2700F-C3130R) that allowed the successful amplification of the region in all Corynebacterium species, R. equi, and T. otitidis between positions 2750 and 3200. The amplified fragment was from 434 to 452 bp, depending on the species. Interestingly, this region was the most variable one (Fig. 1). The similarities observed in the partial rpoB sequence were also significantly less than those observed in the 16S rRNA gene and ranged from 87.9 to 95.9% (Table 3). The similarity between the two C. afermentans subspecies was 96.6% and was thus 0.7% greater than the highest degree of similarity between two species.
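
Routine identification with such a fragment amounts to comparing the sequenced amplicon with reference fragments and reporting the closest one. The following is a minimal sketch of that comparison, assuming the query and references have already been aligned to equal length; it uses a simple percent identity that skips gapped columns, which may differ from the similarity computed by CLUSTAL W. The function names are hypothetical, and the short sequences in the usage note are invented placeholders, not real rpoB data.

```python
from typing import Dict, Tuple


def percent_identity(a: str, b: str) -> float:
    """Percent identical positions between two aligned sequences.

    Columns in which either sequence has a gap ('-') are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def best_match(query: str, references: Dict[str, str]) -> Tuple[str, float]:
    """Name and percent identity of the reference closest to the query."""
    if not references:
        raise ValueError("no reference sequences given")
    scores = {name: percent_identity(query, seq) for name, seq in references.items()}
    name = max(scores, key=scores.get)
    return name, scores[name]
```

For example, `best_match("ACGTACGTTC", {"ref_A": "ACGTACGTAC", "ref_B": "ACGTTCGATC"})` returns `("ref_A", 90.0)`. In practice the returned similarity would still need to be interpreted against between-species and within-species ranges such as those reported in this section.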


The description of new bacterial species at present is based on the results of DNA-DNA hybridization and the description of phenotypic characteristics, so-called polyphasic classification data (7, 19). However, DNA-DNA hybridization is difficult to perform, expensive, technically complex, and labor-intensive. The scarcity of reproducible and distinguishable characters frequently limits phenotypic characterization and, thus, phenotype-based identification in routine clinical microbiology laboratories. The development of gene amplification and sequencing, especially that of 16S rRNA gene sequences, has simplified the taxonomy and identification of bacteria, particularly those lacking distinguishable phenotypic characteristics. However, the 16S rDNA sequences of Corynebacterium spp. are not variable enough to ensure confident results from phylogenetic studies based on high bootstrap values (Fig. 3) or to allow determination of a short sequence for accurate identification (Fig. 1). Our data, based on the rpoB sequences of these bacteria, confirm that this gene is significantly more polymorphic than the 16S rRNA gene, and we propose that it be used to replace or complement the 16S rRNA gene for phylogenetic studies of Corynebacterium. Deeply branching nodes were supported by high bootstrap values and allowed the identification of four clusters (Fig. 2). Even among species not resolved into clusters, some groups of bacteria were confidently identified, such as groups containing Corynebacterium diphtheriae, Corynebacterium pseudotuberculosis, Corynebacterium ulcerans, and Corynebacterium kutscheri.

The high similarity values for the 16S rRNA gene sequences observed among closely related Corynebacterium spp. indicate that the complete sequence should be determined for accurate sequence-based identification (Table 3). By using SVARAP software, we have designed universal primers for rpoB that allow amplification and sequencing of a 434- to 452-bp fragment polymorphic enough to ensure accurate identification of all Corynebacterium spp. The highest degree of similarity of this partial sequence between two species was 95.9%, whereas it was 99.7% for the complete 16S rRNA gene (Table 3), a sequence nearly four times longer. Moreover, the partial sequences of the rpoB genes of two subspecies of C. afermentans had a similarity of 96.6%, which was thus 0.7% above the limit of similarity between two different species. This difference was only 0.1% for the complete 16S rRNA gene sequence, rendering it impossible to distinguish a subspecies from a closely related species only on the basis of this sequence. This difference was even higher (1.6%) when the complete rpoB sequence was considered. From these data, the cutoff for the definition of species and subspecies in the genus Corynebacterium based on the complete rpoB sequence can be made on the basis of similarities of <96.6 and >98%, respectively. These cutoffs are in the same range as those observed for the genera Bartonella, Afipia, and Bosea (9, 12). However, the similarities of a large collection of different strains within particular species would have to be determined for validation of these cutoffs.
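
The proposed cutoffs for the complete rpoB sequence can be written down as a small decision rule. This is only a sketch of how the stated thresholds (<96.6% and >98%) might be applied; the treatment of values between the two thresholds as indeterminate, and the function and label names, are assumptions made for illustration, and the text itself notes that the cutoffs still require validation on larger strain collections.

```python
SPECIES_CUTOFF = 96.6      # below this: different species (per the text)
SUBSPECIES_CUTOFF = 98.0   # above this: subspecies of the same species (per the text)


def interpret_complete_rpob_similarity(similarity_percent: float) -> str:
    """Interpret a pairwise similarity of complete rpoB sequences.

    ASSUMPTION: values from 96.6% to 98% inclusive are reported as
    indeterminate, since the text gives no rule for that interval.
    """
    if not 0.0 <= similarity_percent <= 100.0:
        raise ValueError("similarity must be a percentage between 0 and 100")
    if similarity_percent < SPECIES_CUTOFF:
        return "different species"
    if similarity_percent > SUBSPECIES_CUTOFF:
        return "same species"
    return "indeterminate"
```

Applied to the values reported here, the 98.2% similarity between the two C. afermentans subspecies falls in the same-species range, whereas a pair at 84.9% is classified as different species.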


We are indebted to E. Falsen for providing some of the Corynebacterium strains as a gift.


1. Collins, M. D., R. A. Burton, and D. Jones. 1998. Corynebacterium amycolatum sp. nov., a new mycolic acid-less Corynebacterium species from human skin. FEMS Microbiol. Lett. 49:349-352.
2. Collins, M. D., E. Falsen, E. Akervall, B. Sjödén, and N. Alvarez. 1998. Characterization of a novel non-mycolic acid containing Corynebacterium: description of Corynebacterium kroppenstedtii sp. nov. Int. J. Syst. Bacteriol. 48:1449-1454.
3. Drancourt, M., and D. Raoult. 2002. rpoB gene sequence-based identification of Staphylococcus species. J. Clin. Microbiol. 40:1333-1338.
4. Felsenstein, J. 1989. PHYLIP—phylogeny inference package (version 3.2). Cladistics 5:164-166.
5. Funke, G., and K. A. Bernard. 1999. Coryneform gram-positive rods, p. 319-345. In P. R. Murray, E. J. Baron, M. A. Pfaller, F. C. Tenover, and R. H. Yolken (ed.), Manual of clinical microbiology, 7th ed. American Society for Microbiology, Washington D.C.
6. Funke, G., A. von Graevenitz, J. E. Clarridge III, and K. A. Bernard. 1997. Clinical microbiology of coryneform bacteria. Clin. Microbiol. Rev. 10:125-159.
7. Grimont, P. A. D. 1988. Use of DNA reassociation in bacterial classification. Can. J. Microbiol. 34:541-546.
8. Janda, W. M. 1998. Corynebacterium species and the coryneform bacteria, part I: new and emerging species in the genus Corynebacterium. Clin. Microbiol. Newsl. 20:41-52.
9. Khamis, A., P. Colson, D. Raoult, and B. La Scola. 2003. Usefulness of rpoB gene sequencing for identification of Afipia and Bosea species, including a strategy for the choice of discriminative partial sequences. Appl. Environ. Microbiol. 69:6740-6749.
10. Kim, B. J., S. H. Lee, M. A. Lyu, S. J. Kim, G. H. Bai, G. T. Chae, E. C. Kim, C. Y. Cha, and Y. H. Kook. 1999. Identification of mycobacterial species by comparative sequence analysis of the RNA polymerase gene (rpoB). J. Clin. Microbiol. 37:1714-1720.
11. Ko, K. S., H. K. Lee, M. Y. Park, K. H. Lee, Y. J. Yun, S. Y. Woo, H. Miyamoto, and Y. H. Kook. 2002. Application of RNA polymerase beta-subunit gene (rpoB) sequences for the molecular differentiation of Legionella species. J. Clin. Microbiol. 40:2653-2658.
12. La Scola, B., Z. Zeaiter, A. Khamis, and D. Raoult. 2003. Gene sequence based criteria for species definition in bacteriology: the Bartonella paradigm. Trends Microbiol. 11:318-321.
13. Mollet, C., M. Drancourt, and D. Raoult. 1997. rpoB sequence analysis as a novel basis for bacterial identification. Mol. Microbiol. 26:1005-1011.
14. Pascual, C., P. A. Lawson, J. A. E. Farrow, M. N. Gimenez, and M. D. Collins. 1995. Phylogenetic analysis of the genus Corynebacterium based on the 16S rRNA gene sequences. Int. J. Syst. Bacteriol. 45:724-728.
15. Renesto, P., J. Gouvernet, M. Drancourt, V. Roux, and D. Raoult. 2001. Use of rpoB gene analysis for detection and identification of Bartonella species. J. Clin. Microbiol. 39:430-437.
16. Ruimy, R., P. Riegel, P. Boiron, H. Montiel, and R. Christen. 1995. Phylogeny of the genus Corynebacterium deduced from analyses of small-subunit ribosomal DNA sequences. Int. J. Syst. Bacteriol. 45:740-746.
17. Siebert, P. D., A. Chenchik, D. E. Kellogg, K. A. Lukyanov, and S. A. Lukyanov. 1995. An improved PCR method for walking in uncloned genomic DNA. Nucleic Acids Res. 23:1087-1088.
18. Thompson, J. D., D. G. Higgins, and T. J. Gibson. 1994. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22:4673-4680.
19. Wayne, L. G., D. J. Brenner, R. R. Colwell, P. A. D. Grimont, O. Kandler, M. L. Krichevsky, L. H. Moore, W. E. C. Moore, R. G. E. Murray, E. Stackebrandt, M. P. Starr, and H. G. Trüper. 1987. Report of the Ad Hoc Committee on Reconciliation of Approaches to Bacterial Systematics. Int. J. Syst. Bacteriol. 37:463-464.

Articles from Journal of Clinical Microbiology are provided here courtesy of American Society for Microbiology (ASM)