Appl Environ Microbiol. Apr 2009; 75(8): 2284–2293.
Published online Feb 20, 2009. doi:  10.1128/AEM.02621-08
PMCID: PMC2675225

Characterization of a β-Glucoside Operon (bgc) Prevalent in Septicemic and Uropathogenic Escherichia coli Strains


Escherichia coli strains, in general, do not ferment cellobiose and aryl-β-d-glucosidic sugars, although “cryptic” β-d-glucoside systems have been characterized. Here we describe an additional cryptic operon (bgc) for the utilization of cellobiose and the aryl-β-d-glucosides arbutin and salicin at low temperature. The bgc operon was identified by the characterization of β-glucoside-positive mutants of an E. coli septicemia strain (i484) in which the well-studied bgl (aryl-β-d-glucoside) operon was deleted. These bgc* mutants appeared after 5 days of incubation on salicin indicator plates at 28°C. The bgc operon codes for proteins homologous to β-glucoside/cellobiose-specific phosphoenolpyruvate-dependent phosphotransfer system permease subunits IIB (BgcE), IIC (BgcF), and IIA (BgcI); a porin (BgcH); and a phospho-β-d-glucosidase (BgcA). Next to the bgc operon maps the divergent bgcR gene, which encodes a GntR-type transcriptional regulator. Expression of the bgc operon is dependent on the cyclic-AMP-dependent regulator protein CRP and positively controlled by BgcR. In the bgc* mutants, a single nucleotide exchange enhances the activity of the bgc promoter, rendering it BgcR independent. Typing of a representative collection of E. coli demonstrated the prevalence of bgc in strains of phylogenetic group B2, representing mainly extraintestinal pathogens, while it is rare among commensal E. coli strains. The bgc locus is also present in the closely related species Escherichia albertii. Further, bioinformatic analyses demonstrated that homologs of the bgc genes exist in the enterobacterial Klebsiella, Enterobacter, and Citrobacter spp. and also in gram-positive bacteria, indicative of horizontal gene transfer events.

Members of the family Enterobacteriaceae differ in the ability to utilize cellobiose and other β-glucosides (17, 33). Phytopathogenic enterobacteria such as Erwinia species ferment β-glucosidic sugars (2, 3, 11). Likewise, Klebsiella, Aerobacter, Citrobacter, Hafnia, and Serratia species can utilize β-glucosides (28, 33). In contrast, Escherichia coli and Salmonella species do not ferment β-glucosides. However, spontaneous β-glucoside-positive mutants that ferment salicin and arbutin can be isolated from most E. coli strains (26, 32), while this is not the case for Salmonella sp. (33).

In E. coli, several loci for the utilization of β-glucosides have been characterized. These include the “cryptic” bgl, asc, and arbT loci, as well as the constitutively expressed bglA gene. Among these, the bgl operon is the best characterized (29, 36). The bgl operon is repressed by the nucleoid-associated protein H-NS (10, 19, 24, 34). In the laboratory strain K-12, silencing of bgl by H-NS can be relieved by various mutations, which arise quickly on indicator plates. Consequently, expression of bgl becomes inducible by substrate (20, 24). The bgl operon is present in the majority of E. coli strains and functional in most of them. Interestingly, silencing of the bgl operon is less strict in uropathogenic E. coli and related strains when they are grown at 37°C (31). In contrast, Asc (arbutin, salicin, cellobiose)-positive mutants arise only after prolonged incubation for 4 to 5 weeks (13, 25). In the asc mutants, the ascG gene is disrupted by an IS186 insertion. AscG is a repressor of the divergent ascFB operon, which encodes a phosphoenolpyruvate-dependent phosphotransfer system (PTS) enzyme II permease for arbutin, salicin, and cellobiose and a phospho-β-glucosidase, respectively (13). The constitutively expressed bglA gene encodes an arbutin-specific phospho-β-d-glucosidase (27), and the cryptic arbT locus encodes an arbutin-specific enzyme II permease of the PTS (18). The arbT locus has not been mapped but may be identical to glvCBG, which putatively encodes β-d-glucoside-specific enzyme IIB and IIC subunits and a phospho-β-d-glucosidase (22). Furthermore, multiple mutations in the chb operon, which is an inducible N,N′-diacetylchitobiose system, can convert chb to a cellobiose-specific system (14a, 15).

In this report, we describe an additional β-glucoside operon (named bgc for aryl-β-d-glucosides and cellobiose) which is prevalent in E. coli. The bgc locus was identified when we observed that an E. coli septicemia strain (i484) in which the bgl operon was deleted yielded β-glucoside-positive mutants after a few days of incubation on salicin plates at low temperature. The locus was mapped, and the mutations causing activation were characterized. In addition, evolution of the locus was addressed by phylogenetic studies and typing of a representative collection of pathogenic and commensal E. coli strains.


Strains, plasmids, and media.

The genotypes of the E. coli strains and the relevant structures of the plasmids used in this study are shown in Table 1. Transductions were performed with phage T4GT7 as described previously (9, 40). The bgl operon in strain i484 was deleted with temperature-sensitive plasmid pFMAC11 as described previously (5, 14). The deletion of bgl was confirmed by PCR. Plasmids were constructed according to standard techniques (30). Briefly, for construction of the wild-type bgc and mutant bgc* promoter-lacZ fusions, fragments were amplified with primers S432 (GCTCTAGATTTTTCCTTACTGGTATATAACAGACTACATT; XbaI site, TCTAGA) and S433 (CCGGTCGACGAATTCTTTTGCACCTTTTAAGAGCCATT; SalI site, GTCGAC, and EcoRI site, GAATTC). For construction of the bgc and bgc* fragments that include the divergent bgcR gene, primers S432 and S368 (CCGCTCGAGCGGCCGCGTCGACGGCGTAAAGCGGTAAAGGTCA; XhoI site, CTCGAG) were used. The PCR fragments were digested with XbaI and SalI or XhoI and cloned into pACYC-derived, SalI- and XbaI-digested vector pKEKB30 (10). All cloned PCR fragments were sequenced. Phenotypes were analyzed with bromthymol blue (BTB) indicator plates (9) containing 0.5% salicin or cellobiose or with MacConkey (Difco) arbutin (0.5%) plates. When necessary, antibiotics were added to final concentrations of 12 μg/ml tetracycline, 25 μg/ml kanamycin, 50 μg/ml ampicillin, 15 μg/ml chloramphenicol, and 50 μg/ml spectinomycin.
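Because the restriction sites in the primer sequences were originally marked only typographically, it can help to locate them programmatically. The following is a minimal sketch and not part of the original methods: the helper name `find_sites` is hypothetical, and the recognition sequences are the standard ones for the four enzymes named above.

```python
# Standard recognition sequences of the enzymes named in the text.
RECOGNITION_SITES = {
    "XbaI": "TCTAGA",
    "SalI": "GTCGAC",
    "EcoRI": "GAATTC",
    "XhoI": "CTCGAG",
}

# Primer sequences as given in the text (5' to 3').
PRIMERS = {
    "S432": "GCTCTAGATTTTTCCTTACTGGTATATAACAGACTACATT",
    "S433": "CCGGTCGACGAATTCTTTTGCACCTTTTAAGAGCCATT",
    "S368": "CCGCTCGAGCGGCCGCGTCGACGGCGTAAAGCGGTAAAGGTCA",
}


def find_sites(sequence):
    """Return {enzyme: [0-based start positions]} for every site found."""
    hits = {}
    for enzyme, site in RECOGNITION_SITES.items():
        positions = []
        start = sequence.find(site)
        while start != -1:
            positions.append(start)
            start = sequence.find(site, start + 1)
        if positions:
            hits[enzyme] = positions
    return hits


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, find_sites(seq))
```

In this sketch, primer S368 is reported to contain a SalI site in addition to the XhoI site, which would be consistent with the fragments being digested with "SalI or XhoI"; that reading is an inference, not a statement from the methods.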

TABLE 1.
Strains and plasmids used in this study

Transposon mutagenesis.

Transposon mutagenesis screens were performed with pKESK18 carrying a mini-Tn10-Catr transposon as described previously (20). In brief, replication of this plasmid is temperature sensitive and the expression of the Tn10 transposase is temperature regulated. Thus, at 28°C the plasmid replicates whereas the transposase is not expressed. Upon a temperature shift to 42°C, expression of the transposase gene, and thus transposition, is induced while replication of the plasmid stops, allowing the selection of transposon mutants on chloramphenicol plates at 42°C. Transformants of strains KEC131 to KEC134 with plasmid pKESK18 were grown overnight at 28°C and then streaked onto LB chloramphenicol plates for the selection of transposon mutants at 42°C. These mutants were replica plated on BTB salicin indicator plates and incubated at 28°C, and salicin-negative mutants were isolated. The insertion position of the mini-Tn10-Catr transposon was determined by a semirandom two-step PCR protocol followed by sequencing, as described previously (6, 20).

β-Galactosidase assay.

For enzyme assays, cells were grown in M9 medium containing 0.4% (wt/vol) glycerol or glucose, 0.66% (wt/vol) Casamino Acids (Difco), and 1 μg/ml vitamin B1 or in LB medium (Difco), as indicated. Cultures were inoculated to an optical density at 600 nm (OD600) of 0.1 to 0.15 from fresh overnight cultures and grown in the same medium to an OD600 of 0.5 at 37 or 28°C, as indicated. Salicin was added at a 0.2% final concentration to the overnight and exponential cultures where indicated. β-Galactosidase assays were performed as described previously (9, 23). Enzyme activities were determined at least three times from at least two independent transformants or integration derivatives. Standard deviations were less than 10%.
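The replicate criterion stated above (at least three determinations, standard deviation below 10%) can be checked with a short calculation. This is a minimal sketch under stated assumptions: the function name `summarize_activity` is hypothetical, the sample standard deviation is assumed (the text does not say which estimator was used), and the replicate values are invented for illustration only.

```python
from statistics import mean, stdev


def summarize_activity(replicates):
    """Return (mean, relative standard deviation in percent) of replicates."""
    if len(replicates) < 3:
        raise ValueError("at least three determinations are expected")
    avg = mean(replicates)
    # Sample standard deviation expressed as a percentage of the mean.
    rel_sd = 100.0 * stdev(replicates) / avg
    return avg, rel_sd


# Hypothetical replicate activities (units); not data from the study.
example = [950.0, 1000.0, 1050.0]
avg, rel_sd = summarize_activity(example)
within_criterion = rel_sd < 10.0
```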

Bioinformatic and phylogenetic analyses.

For the analysis of the prevalence of bgc in E. coli (including Shigella strains) the nucleotide sequence of the entire bgc locus, including the flanking core genes marB and ydeD, of strain CFT073 was used as a query to search the NCBI nonredundant nucleotide database with BLASTN. The sequences of the marB-bgc-ydeD and marB-ydeD loci were then extracted from the published E. coli and Shigella genome sequences, as well as from E. albertii, and used for structural comparison, as well as for phylogenetic analysis. For the latter, the sequences were aligned and a neighbor-joining (NJ) tree was constructed with MEGA4 (37). In strains in which the bgc locus is disrupted by insertion elements, these sequences were manually removed. To correlate the phylogeny of the bgc locus with the species phylogeny, the sequences of the seven multilocus sequence typing (MLST) loci (41) were also extracted from the genome sequences, concatenated, and used to construct an NJ tree with MEGA4. To identify homologues of the genes encoded by the bgc locus, the deduced protein sequences from E. coli CFT073 were used as queries to search the NCBI microbial genome database, as well as the NCBI nonredundant nucleotide database, with TBLASTN. In parallel, the UniProt database (38) was searched for homologues with BLASTP. Similar results were obtained in these searches.
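The concatenation step for the MLST loci can be sketched as follows. This is only an illustration of the idea, not the procedure used with MEGA4: the function name `concatenate_loci`, the locus names, and the toy sequences are all hypothetical, and the sketch assumes the per-locus sequences are already aligned.

```python
def concatenate_loci(sequences_by_locus, locus_order):
    """Join per-locus aligned sequences for each strain in a fixed locus order.

    sequences_by_locus: {locus: {strain: aligned_sequence}}
    Returns {strain: concatenated_sequence}. Only strains present at every
    locus are kept, so all concatenated sequences cover the same loci.
    """
    strains = set.intersection(
        *(set(sequences_by_locus[locus]) for locus in locus_order)
    )
    return {
        strain: "".join(sequences_by_locus[locus][strain] for locus in locus_order)
        for strain in sorted(strains)
    }


# Hypothetical toy data; locus names and sequences are placeholders.
toy = {
    "locusA": {"strain1": "ATGC", "strain2": "ATGA"},
    "locusB": {"strain1": "GGTT", "strain2": "GGTA", "strain3": "GGTC"},
}
concatenated = concatenate_loci(toy, ["locusA", "locusB"])
```

Keeping a fixed locus order matters because the concatenated sequences are compared position by position when the tree is built.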

Typing of the bgc locus.

Typing for the presence of bgc was performed by PCR with oligonucleotides S693 (AACGTGACAACGTCACTGAGGCAAT; specific for the flanking gene marB), S694 (AACGGTCAGCATGTGGCGATG; specific for ydeD), and S695 (TGAAATCGCCAGTATTTTACGGATCAG; specific for bgcR). The PCR fragments were sequenced to confirm specific amplification and to analyze for sequence variations. Strains which yielded a PCR fragment of 770 bp with oligonucleotides S694 and S695 carry the bgcR gene and were assigned to bgc type I. Strains from which a specific 690-bp PCR fragment could be amplified with primers S693 and S694 lack bgc and were assigned to type II. These types were color coded and mapped onto a minimal spanning tree representing the population structure of a representative E. coli collection based on MLST as described previously (31, 41).
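The assignment rule described above can be written as a small decision function. This is a minimal sketch, assuming PCR results are recorded as fragment sizes per primer pair; the function name `classify_bgc_type`, the data layout, and the size tolerance are hypothetical and not taken from the study, in which fragments were additionally confirmed by sequencing.

```python
def classify_bgc_type(amplicons, tolerance_bp=20):
    """Assign a bgc type from PCR fragment sizes.

    amplicons: {(primer1, primer2): fragment size in bp, or None if no product}
    tolerance_bp: assumed allowance for gel-based size estimates.
    """
    def has_product(pair, expected_bp):
        size = amplicons.get(pair)
        return size is not None and abs(size - expected_bp) <= tolerance_bp

    # A 770-bp product with S694 and S695 indicates that bgcR is present.
    if has_product(("S694", "S695"), 770):
        return "type I"
    # A 690-bp product with S693 and S694 indicates that bgc is absent.
    if has_product(("S693", "S694"), 690):
        return "type II"
    return "untyped"
```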


E. coli strain i484 carries an additional β-d-glucoside system.

Septicemia-causing E. coli strain i484 carries the H-NS-repressed bgl operon (16). In the course of experiments with bgl, we constructed a derivative of i484 with a deletion of the bgl operon. This i484Δbgl strain (KEC93) was salicin and arbutin negative at 37 and 28°C, as expected. Surprisingly, though, i484Δbgl yielded salicin-positive papillae on BTB salicin indicator plates after 5 days of incubation at 28°C but not at 37°C (Fig. 1). Four of the salicin-positive papillae were restreaked, and these four mutants (KEC131, KEC132, KEC133, and KEC134) were salicin positive at 28°C but negative at 37°C (Fig. 1; see Table 3), indicating the existence of an additional system for β-glucoside utilization whose expression can be activated by mutations.

FIG. 1.
Identification of the cryptic bgc (β-glucoside and cellobiose) operon in E. coli i484. (A) E. coli strain i484Δbgl (KEC93), which is a derivative of septicemia strain i484 (16), yields salicin-positive (Sal+) mutants at 28°C. ...

TABLE 3.
β-d-Glucoside phenotypes of the strains used in this study

To identify the locus responsible for the salicin-positive phenotype, mini-Tn10-Catr transposon mutagenesis of each of the four independent β-glucoside-positive mutants (KEC131 to KEC134) was performed (see Materials and Methods). This screen yielded 11 salicin-negative mutants, of which 7 were characterized by semirandom two-step PCR and sequencing (6, 20). All of these seven independent mini-Tn10-Catr insertion mutations mapped in open reading frames identical to c1956, c1957, c1958, and c1959 present in uropathogenic E. coli strain CFT073 (Fig. 1B). This locus represents a genomic island of six open reading frames. Genes c1959 to c1955, here renamed bgcEFIHA, are likely to constitute an operon, since there are only short intergenic regions between the open reading frames. The deduced amino acid sequences of c1959 to c1955 are similar to those of proteins associated with the utilization of β-d-glucosides as carbon sources in E. coli and other bacteria (Fig. 1 and Table 2). Open reading frames c1959 (renamed bgcE), c1958 (bgcF), and c1957 (bgcI) putatively encode PTS permease subunits IIB, IIC, and IIA of the family of lactose/cellobiose-specific permeases (21). Gene bgcH (c1956) putatively encodes a porin of the family of maltoporin-like channels, and the deduced protein encoded by the last gene, bgcA (c1955), is highly homologous to phospho-β-d-glucosidases. Gene bgcR (c1960), which maps in an orientation divergent from that of the bgc operon, encodes a transcription factor with an N-terminal GntR-type helix-turn-helix motif (Fig. 1).

TABLE 2.
E. coli bgc-encoded proteins and homologues in other species

Among gammaproteobacteria, close homologues to some of the bgc genes were found in the Enterobacteriaceae family members Klebsiella pneumoniae, Enterobacter sp. strain 638, and Citrobacter koseri. However, additional highly similar homologues are present in gram-positive bacteria (Table 2). The bgcE- and bgcF-encoded proteins are highly similar to permeases encoded by gram-positive bacteria (Table 2). The bgcI gene-encoded IIA subunit is most similar to the Erwinia bglI-encoded protein. The putative bgcH-encoded porin is 50% identical to the bgl operon-encoded porin. The bgcA-encoded protein is highly similar (with more than 64% identity) to β-glucosidases of Enterobacteriaceae and gram-positive bacteria. However, BgcA is only 60% identical to the E. coli bglA gene product, although several of its homologs are misleadingly annotated as bglA. Finally, for bgcR, which encodes the presumptive positive GntR-type transcriptional regulator, no homologues with more than 35% identity were found in the databases.

bgc encodes proteins for the utilization of aryl-β-d-glucosides and cellobiose.

The function of the bgc locus was further analyzed by comparing the phenotypes of the wild-type and mutant bgc strains on β-glucoside indicator plates at 28 and 37°C (Table 3). Strain KEC93, carrying the wild-type bgc locus, had a negative phenotype on all of the β-glucosides tested, including salicin, arbutin, and cellobiose (Table 3). The phenotype of the four mutants (KEC131 to KEC134) was positive on all of the β-glucosides tested at 28°C, while at 37°C the bgc* mutants were salicin and cellobiose negative but remained weakly arbutin positive (Table 3). The bgc::mini-Tn10-Catr insertion mutations mapping in genes bgcE, bgcF, and bgcI, which encode EII permease subunits, were β-glucoside negative on all three sugars (Table 3). Interestingly, bgcH::mini-Tn10-Catr mutants KEC143 and KEC144 remained weakly arbutin positive at 28°C and at 37°C (Table 3). The salicin- and cellobiose-negative phenotype suggests that this mutation has a polar effect on the expression of bgcA. Therefore, the arbutin-positive phenotype is likely due to the constitutively expressed bglA gene, which encodes an arbutin-specific phospho-β-d-glucosidase and maps elsewhere in the genome (27, 39). In comparison to the phenotype conferred by the “activated” bgc locus, isogenic strain KEC2, carrying an activated bgl operon, was strongly salicin and arbutin positive after 1 day of incubation at 28°C and at 37°C. However, strain KEC2 was cellobiose negative, as expected, since the bgl operon does not encode cellobiose-specific enzymes (Table 3). Taken together, the data suggest that the bgc locus encodes proteins for the uptake and hydrolysis of aryl-β-d-glucosides and cellobiose at low temperature. In comparison to the phenotype conferred by the bgl operon, that conferred by the bgc locus is weaker.

To further address the role of bgc, we analyzed whether activation of bgc allows growth on minimal plates containing arbutin, salicin, or cellobiose as the sole carbon source. The bgc* mutants (KEC131 to KEC134), but not strain KEC93 carrying the wild-type bgc locus, grew moderately well on arbutin minimal plates at 28°C. On salicin and cellobiose plates, the growth of the bgc* mutants was poor but distinguishable from that of the wild-type bgc strain, supporting the conclusion that the bgc operon encodes proteins for the utilization of aryl-β-d-glucosides and cellobiose.

A point mutation in the promoter region activates the bgc operon.

To characterize the mutation causing spontaneous activation of the bgc operon, the putative regulatory region between bgcE (c1959) and bgcR (c1960) and the bgcR gene were sequenced from parent strain KEC93 and from the four spontaneous β-glucoside-positive mutants (KEC131 to KEC134). The sequence of wild-type bgc strain i484Δbgl was identical to the published sequence of CFT073. Furthermore, all four independent spontaneous mutants carry the identical single point mutation of G/C to A/T at nucleotide position −67 relative to the AUG of bgcE (Fig. 2). This mutation maps within a sequence that matches the consensus binding sequence of the catabolite regulator protein (CRP), where it affects a less conserved position (12) (Fig. 2). To exclude the possibility that the bgc* mutants (KEC131 to KEC134) acquired mutations in addition to the single point mutation within the promoter region, the sequence of the whole bgc operon was determined for parent strain i484 and bgc* mutant KEC133. The sequences of the bgc operon of i484 and its bgc* mutant were identical. In comparison to the sequence of CFT073, one synonymous single nucleotide exchange was detected which changes the 58th valine codon of bgcH from GTG to GTA. This confirms that the single point mutation in the promoter region activates the bgc operon.

FIG. 2.
Activities of the wild-type (wt) bgc and mutant bgc* promoters in E. coli K-12. (A) A single-nucleotide C/G-to-T/A exchange at position −67 relative to the putative translation start of bgcE causes activation of bgc in bgc* mutants ...

Regulation of bgc expression by CRP and BgcR.

To analyze the effect of the single point mutation on the expression of the bgc operon, the regulatory regions of the wild-type bgc locus (from KEC93) and a mutant carrying the point mutation (KEC134) were fused to lacZ (Fig. 2). In addition, a second set of lacZ reporter fusions was constructed which also carried the divergent bgcR gene, which presumably encodes a transcriptional regulator (Fig. 2). The resulting plasmids (with a pACYC origin of replication and thus present at 10 to 15 copies per cell) were used to transform E. coli K-12 strain S541 (Δbgl ΔlacZ), and the β-galactosidase activities directed by these plasmids were determined in cultures grown to the mid-exponential phase (OD600 = 0.5) at 37°C.

Transformants carrying plasmid pKEGN46, in which the wild-type bgc regulatory region is fused to lacZ, expressed 2,995 U of β-galactosidase activity when grown in minimal M9 medium with glycerol as a carbon source (Fig. 2B). The β-galactosidase activity decreased about threefold (1,020 U) when cells were grown in minimal M9 medium with glucose (Fig. 2B), indicating catabolite regulation of bgc. In an isogenic crp mutant (S996) grown in LB medium, the β-galactosidase activity was sixfold lower (380 U) than that of the wild type (2,465 U) (Fig. 2B), which confirms catabolite control of the bgc promoter. Similar results were obtained with plasmid pKEGN51, which carries the wild-type bgc regulatory region and, in addition, the divergent bgcR gene (Fig. 2B, pKEGN51). Transformants of E. coli K-12 with the bgc*-lacZ fusion (pKEGN48) carrying the single point mutation expressed eightfold higher levels of β-galactosidase than the wild-type bgc-lacZ fusions in all of the media tested (Fig. 2C). Interestingly, in the crp mutant, expression decreased to 318 U, i.e., to levels similar to those directed by the wild-type bgc-lacZ fusion, suggesting that the expression of the mutant remains CRP dependent. The presence of the divergent bgcR gene again had no effect on the expression level (Fig. 2C). These data show that the single point mutation enhances the activity of the bgc promoter and that the bgc promoter is CRP dependent.
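The fold differences quoted in this paragraph follow directly from the reported activities. As a minimal sketch (the helper name `fold_change` is hypothetical; the numbers are those given in the text), the ratios can be recomputed as follows.

```python
def fold_change(reference_units, comparison_units):
    """Return how many times higher reference_units is than comparison_units."""
    if comparison_units <= 0:
        raise ValueError("comparison activity must be positive")
    return reference_units / comparison_units


# Activities reported in the text (units of beta-galactosidase activity).
glycerol_vs_glucose = fold_change(2995, 1020)  # wild-type fusion, M9 media
wild_type_vs_crp = fold_change(2465, 380)      # wild-type fusion, LB medium
```

Rounded to one decimal place, these ratios are 2.9 and 6.5, in line with the "about threefold" and "sixfold" differences described.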

Furthermore, we tested whether the histone-like nucleoid structuring protein H-NS, which is crucial for silencing of the bgl operon (7, 10), has a role in the regulation of bgc. To this end, the β-galactosidase expression directed by the bgc and bgc* reporter plasmids was also determined in an hns::Amp null mutant (S614) isogenic to wild-type K-12 strain S541 (Δbgl ΔlacZ) (Fig. 2). However, a mere 1.5-fold difference in activity compared to that of the wild-type strain was observed, suggesting that H-NS has no significant role in the regulation of bgc.

To further address the role of BgcR, the expression level directed by the bgc-lacZ fusion was determined in E. coli strain KEC93 and its bgc* mutant derivative KEC132 (Fig. 3). To this end, cells were grown to mid-exponential phase at 28°C in minimal M9 glycerol medium and salicin and cellobiose were added where indicated (Fig. 3). The wild-type bgc promoter-lacZ fusion directed the expression of 885 U of β-galactosidase activity, and the addition of salicin or cellobiose had no effect (Fig. 3A). The activity was similar when the plasmids also carried the bgcR gene (Fig. 3B). With bgc* mutant KEC132, similar results were obtained when no sugar was added. Interestingly, in the presence of salicin, the expression of β-galactosidase activity increased from 935 to 8,220 U when the bgcR gene was also encoded on the reporter (Fig. 3B), while the addition of cellobiose had no effect (Fig. 3B). The bgc*-lacZ fusions again directed about eightfold higher levels of β-galactosidase activity (8,310 U) than the wild-type bgc-lacZ fusions (Fig. 3C). However, the expression level directed by the bgc*-lacZ fusion also containing the bgcR gene increased only 1.5-fold upon the addition of salicin in the bgc* background (Fig. 3D). These data suggest that BgcR is a positive regulator of the bgc promoter, which can be induced by salicin in a bgc* background, and that the point mutation enhances the promoter activity so that it is almost independent of activation by BgcR. Further, an expression analysis of cells grown at 37°C was also performed and the n-fold differences in expression level were similar, although the values per se were increased approximately threefold (not shown). This demonstrates that the BgcR-dependent induction by salicin in the bgc* background is independent of the temperature and suggests that expression of the operon may be temperature regulated.

FIG. 3.
Induction of the bgc promoter by salicin in a bgc* background. Wild-type bgc strain i484Δbgl and its bgc* (Bgc-positive) derivative KEC132, as well as bgl+ strain KEC2, were transformed with plasmids carrying Pbgc and ...

The finding that salicin but not cellobiose induces the bgc-lacZ fusion in the bgc* background may suggest that intracellular phospho-salicin induces activation of the bgc-lacZ fusion by BgcR. To test this, the expression levels directed by the bgc-lacZ fusions were also determined in strain KEC2, which is isogenic to KEC93 but carries an activated bgl operon. Strain KEC2 thus expresses the permease EIIBgl when salicin is added to the culture. However, in strain KEC2, expression of the bgc promoter-lacZ fusion containing bgcR was not induced by salicin (Fig. 3B), suggesting that intracellular phospho-salicin is not the inducing molecule.

Phylogeny of bgc and prevalence in E. coli.

As bgc is present in E. coli CFT073 and i484 but not in E. coli K-12, the locus belongs to the variable gene pool of E. coli. To characterize the prevalence of bgc in E. coli, including Shigella species, we searched the NCBI sequence database and typed a representative collection of E. coli strains (31) for the presence of bgc.

The NCBI nonredundant nucleotide sequence database was searched by BLASTN for bgc by using the nucleotide sequence of the whole bgc locus and its flanking core genome genes from CFT073. This search yielded highly significant hits in E. coli and Shigella genomes. The bgc locus was found in 16 of the 23 E. coli and Shigella genome sequences tested, and these 16 were analyzed in further detail (Fig. 4). In seven genomes, all of the genes of the bgc locus are present; in two genomes, frameshift mutations disrupt a gene of the bgc locus; and in seven genomes (including those of four Shigella strains), the bgc locus is disrupted by insertions and/or deletions (Fig. 4). No bgc homologues were found in the remaining seven genomes, including those of E. coli K-12 strain MG1655, 101-1, 53638, and O157:H7; Shigella flexneri 5 and 301; and Shigella dysenteriae Sd197. Furthermore, the complete bgc locus is also present in the closely related species E. albertii (Fig. 4).

FIG. 4.
Structure of the bgc locus in E. coli and E. albertii. Schematically shown are the structures of the chromosomal marRAB-bgc-ydeDF region and the marRAB-ydeDF region, respectively, as extracted from genome sequences of E. coli strains (including Shigella ...

For phylogenetic analysis, the sequences of the bgc loci present in E. coli and Shigella sp., as well as in E. albertii, were extracted from the published genomes and aligned. NJ trees were constructed from the alignment, which includes strains with a complete bgc locus, as well as strains which carry deletions within bgc (Fig. 5A). In parallel, the sequences of seven housekeeping genes which are routinely used for MLST of E. coli (41) were also extracted from the published genomes, concatenated, and aligned, and the alignment was used to construct an NJ tree (Fig. 5B). Comparison of the two trees demonstrates that their topologies are similar. Strains in which bgc is disrupted or absent are indicated in Fig. 5B. The similarity of the tree topologies suggests that bgc was present in a predecessor of E. coli and lost in strains in which it is not present.

FIG. 5.
Comparison of the phylogeny of the bgc operon with the phylogeny of E. coli. The sequence of the bgc operon was extracted from the genome sequences of the strains indicated, aligned, and analyzed by an NJ tree. For comparison, the sequences of the seven ...

In addition, a representative collection of E. coli strains (see Table S1 in the supplemental material) was typed for the presence of the bgc locus by PCR and sequencing with oligonucleotide primers specific for the flanking core genes and the bgc locus. The presence and absence of bgc were then color coded and mapped onto a minimal spanning tree (MSTREE) which represents the population structure of the E. coli collection used (Fig. 6). Strains which carry the bgc locus are indicated in light gray. This demonstrated that bgc is highly prevalent in strains belonging to sequence type (ST) complexes 73 and 95 and related STs all belonging to phylogenetic group B2 of E. coli. The bgc locus is rare in strains of ST complex 10 of phylogenetic group A, and it is present in approximately 50% of the remaining strains. Some strains were found to carry deletions within the bgc locus (see Table S1 in the supplemental material). The collection of E. coli strains which was typed also included three strains (E10083, Z205, and RL325/96) which are presumably representatives of a second E. coli population that diverged early in evolution (1, 31). Interestingly, these strains did not contain bgc genes at the marB-ydeD locus (Fig. 6). The phylogenetic distribution of bgc is compatible with the presence of bgc in an ancestral enterobacterium, followed by multiple loss events along with the diversification of gammaproteobacteria and E. coli into four phylogenetic groups. The presence of mutated bgc loci in several strains further suggests that the process of erosion is still ongoing.

FIG. 6.
Distribution of the bgc locus within a representative collection of E. coli strains represented by a minimal spanning tree. A representative collection of E. coli strains was typed for the presence or absence of bgc by PCR and sequencing. Presence and ...


We have presented evidence for the existence of an additional β-glucoside system (bgc) for the utilization of aryl-β-d-glucosides and cellobiose at low temperature in E. coli. The wild-type bgc system does not allow the utilization of β-glucosides under standard laboratory growth conditions. However, spontaneous Bgc-positive mutants arise after 5 days of incubation at room temperature or 28°C. In these mutants, the activity of the bgc promoter is enhanced and becomes independent of a putative positive regulator encoded upstream of the bgc operon. The bgc locus belongs to the variable gene pool of E. coli. Within the E. coli population, bgc is most prevalent among strains of phylogenetic group B2, which includes extraintestinal pathogens, while it is rare in commensal E. coli strains. The bgc locus is also present in the closely related species E. albertii. Phylogenetic analyses revealed that close homologues of bgc genes are present in the enterobacterial species Klebsiella, Enterobacter, and Citrobacter but also in gram-positive bacteria. Taken together, these findings indicate that bgc genes were acquired during the evolution of the family Enterobacteriaceae, while the absence of bgc in several E. coli strains may be based on gene loss events. The finding that the bgc locus is not present or is disrupted in several strains of E. coli suggests that bgc does not provide an advantage or that its inactivation and loss are positively selected in most of the habitats occupied by E. coli. bgc, which is most preserved among extraintestinal pathogens, may provide an advantage for growth at low temperature outside of the host.

In addition to bgc, other apparently cryptic β-glucoside systems have been characterized in E. coli. These include the well-studied bgl operon, as well as the asc and arbT/glvCBG loci (see introduction). The bgl operon also belongs to the variable gene pool of E. coli, and bgl genes were presumably acquired by horizontal transfer from gram-positive bacteria during the evolution of the family Enterobacteriaceae (31). However, the regulation of the bgc operon and that of the bgl operon are strikingly different. Regulation of bgl includes silencing by H-NS and substrate-specific control by transcriptional antitermination. The bgc operon is presumably not regulated by H-NS, and no sequence motifs important for regulation by antitermination are present within the operon. Interestingly, the bgl operon is likewise most conserved in extraintestinal pathogens and silencing of bgl is less strict in approximately half of the strains belonging to this group (31). This suggests that utilization of β-glucosides may provide an advantage in a habitat occupied by these bacteria.

The bgcEFIHA operon encodes five proteins, including enzyme IIB, IIC, and IIA subunits of a β-glucoside-specific enzyme II PTS permease, a putative β-glucoside-specific porin of the LamB family (BgcH), and phospho-β-d-glucosidase BgcA. BgcA or expression of bgcA is presumably temperature sensitive. Mutants that carry an activated bgc* locus are salicin and cellobiose positive at 28°C, but not at 37°C. However, they are (weakly) arbutin positive at 28°C and 37°C, presumably due to the activity of the constitutively expressed bglA gene, which encodes an arbutin-specific phospho-β-d-glucosidase (27, 39).

BgcR, which is encoded upstream of the bgcEFIHA operon, is likely to be a positive regulator, since the expression of a bgc promoter-lacZ fusion containing bgcR is induced by salicin in a strain background which carries an activated bgc* locus. However, it remains unclear why induction requires bgcR to be encoded in cis to the bgc promoter and how addition of salicin induces expression. Intracellular phospho-salicin generated through transport by the bgl-encoded enzyme II permease in strain i484 (KEC2) does not induce the bgc promoter. However, it is possible that a different substrate induces the expression of bgc. Similarly, it is surprising that the four independent bgc* mutants (which were picked as papillae from different colonies) carry the identical point mutation. If this mutation indeed renders the bgc promoter independent of activation by BgcR, other mutations should also activate bgc. It is possible, though, that this single point mutation has the strongest effect and was thus picked four times independently.

The reason for the presence of cryptic β-glucoside systems in E. coli remains unclear. It has been speculated that silencing of bgl serves to prevent the utilization of toxic aryl-β-d-glucoside substrates (29). However, β-glucoside systems in other members of the family Enterobacteriaceae such as Erwinia and Klebsiella species, as well as in gram-positive bacteria, are not cryptic, which may indicate that silencing or the cryptic state of β-glucoside systems is specific for the intestinal habitat. It was demonstrated that the bgl operon is maintained in a functional state in strains of phylogenetic group B2 of E. coli, which represents extraintestinal pathogens, while it is absent or subject to erosion in other groups of E. coli strains (31). Interestingly, in strain i484, which belongs to phylogenetic group B2, expression of bgl is induced in vivo (16). This indicates a functional role for β-glucoside utilization in extraintestinal E. coli. Similarly, the bgc operon is most preserved among the strains belonging to phylogenetic group B2. This may indicate that the presence of a locus for the utilization of β-glucosides at low temperature provides an advantage for strains belonging to this group.



T.S.S. was funded by the International Graduate School of Genetics and Functional Genomics at the University of Cologne, and G.N. was funded by the Deutsche Forschungsgemeinschaft through the Graduiertenkolleg Genetik zellulärer Systeme.

We thank Vartul Sangal and Mark Achtman for help in the analysis of the clonal distribution of bgc and for discussions.


Published ahead of print on 20 February 2009.

Supplemental material for this article may be found at http://aem.asm.org/.


1. Achtman, M., and M. Wagner. 2008. Microbial diversity and the genetic nature of microbial species. Nat. Rev. Microbiol. 6:431-440.
2. An, C. L., W. J. Lim, S. Y. Hong, E. J. Kim, E. C. Shin, M. K. Kim, J. R. Lee, S. R. Park, J. G. Woo, Y. P. Lim, and H. D. Yun. 2004. Analysis of bgl operon structure and characterization of β-glucosidase from Pectobacterium carotovorum subsp. carotovorum LY34. Biosci. Biotechnol. Biochem. 68:2270-2278.
3. An, C. L., W. J. Lim, S. Y. Hong, E. C. Shin, M. K. Kim, J. R. Lee, S. R. Park, J. G. Woo, Y. P. Lim, and H. D. Yun. 2005. Structural and biochemical analysis of the asc operon encoding 6-phospho-β-glucosidase in Pectobacterium carotovorum subsp. carotovorum LY34. Res. Microbiol. 156:145-153.
4. Bremer, E., P. Gerlach, and A. Middendorf. 1988. Double negative and positive control of tsx expression in Escherichia coli. J. Bacteriol. 170:108-116.
5. Caramel, A., and K. Schnetz. 1998. Lac and lambda repressor relieve silencing of the Escherichia coli bgl promoter. Activation by alteration of a repressing nucleoprotein complex. J. Mol. Biol. 284:875-883.
6. Chun, K. T., H. J. Edenberg, M. R. Kelley, and M. G. Goebl. 1997. Rapid amplification of uncharacterized transposon-tagged DNA sequences from genomic DNA. Yeast 13:233-240.
7. Defez, R., and M. de Felice. 1981. Cryptic operon for β-glucoside metabolism in Escherichia coli K12: genetic evidence for a regulatory protein. Genetics 97:11-25.
8. Dersch, P., K. Schmidt, and E. Bremer. 1993. Synthesis of the Escherichia coli K-12 nucleoid-associated DNA-binding protein H-NS is subjected to growth-phase control and autoregulation. Mol. Microbiol. 8:875-889.
9. Dole, S., S. Kühn, and K. Schnetz. 2002. Post-transcriptional enhancement of Escherichia coli bgl operon silencing by limitation of BglG-mediated antitermination at low transcription rates. Mol. Microbiol. 43:217-226.
10. Dole, S., V. Nagarajavel, and K. Schnetz. 2004. The histone-like nucleoid structuring protein H-NS represses the Escherichia coli bgl operon downstream of the promoter. Mol. Microbiol. 52:589-600.
11. El Hassouni, M., B. Henrissat, M. Chippaux, and F. Barras. 1992. Nucleotide sequences of the arb genes, which control β-glucoside utilization in Erwinia chrysanthemi: comparison with the Escherichia coli bgl operon and evidence for a new β-glycohydrolase family including enzymes from eubacteria, archaebacteria, and humans. J. Bacteriol. 174:765-777.
12. Gunasekera, A., Y. W. Ebright, and R. H. Ebright. 1992. DNA sequence determinants for binding of the Escherichia coli catabolite activator protein. J. Biol. Chem. 267:14713-14720.
13. Hall, B. G., and L. Xu. 1992. Nucleotide sequence, function, activation, and evolution of the cryptic asc operon of Escherichia coli K12. Mol. Biol. Evol. 9:688-706.
14. Hamilton, C. M., M. Aldea, B. K. Washburn, P. Babitzke, and S. R. Kushner. 1989. New method for generating deletions and gene replacements in Escherichia coli. J. Bacteriol. 171:4617-4622.
14a. Kachroo, A. H., A. K. Kancherla, N. S. Singh, U. Varshney, and S. Mahadevan. 2007. Mutations that alter the regulation of the chb operon of Escherichia coli allow utilization of cellobiose. Mol. Microbiol. 66:1382-1395.
15. Keyhani, N. O., and S. Roseman. 1997. Wild-type Escherichia coli grows on the chitin disaccharide N,N′-diacetylchitobiose, by expressing the cel operon. Proc. Natl. Acad. Sci. USA 94:14367-14371.
16. Khan, M. A., and R. E. Isaacson. 1998. In vivo expression of the β-glucoside (bgl) operon of Escherichia coli occurs in mouse liver. J. Bacteriol. 180:4746-4749.
17. Kharat, A. S. 2001. Phenotypic variability of β-glucoside utilization and its correlation to pathogenesis process in a few enteric bacteria. FEMS Microbiol. Lett. 199:241-246.
18. Kricker, M., and B. G. Hall. 1987. Biochemical genetics of the cryptic gene system for cellobiose utilization in Escherichia coli K12. Genetics 115:419-429.
19. Lopilato, J., and A. Wright. 1990. Mechanisms of activation of the cryptic bgl operon of Escherichia coli K-12, p. 435-444. In K. Drlica and M. Riley (ed.), The bacterial chromosome. American Society for Microbiology, Washington, DC.
20. Madhusudan, S., A. Paukner, Y. Klingen, and K. Schnetz. 2005. Independent regulation of H-NS mediated silencing of the bgl operon at two levels: upstream by BglJ and LeuO and downstream by DnaKJ. Microbiology 151:3349-3359.
21. Marchler-Bauer, A., and S. H. Bryant. 2004. CD-Search: protein domain annotations on the fly. Nucleic Acids Res. 32:W327-W331.
22. Mayer, C., and W. Boos. March 2005. Chapter 3.4.1. Hexose/pentose and hexitol/pentitol metabolism. In A. Böck, R. Curtiss III, J. B. Kaper, F. C. Neidhardt, T. Nyström, K. E. Rudd, and C. L. Squires (ed.), EcoSal—Escherichia coli and Salmonella: cellular and molecular biology. ASM Press, Washington, DC. http://www.ecosal.org.
23. Miller, J. H. 1992. A short course in bacterial genetics. A laboratory manual and handbook for Escherichia coli and related bacteria. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY.
24. Nagarajavel, V., S. Madhusudan, S. Dole, A. R. Rahmouni, and K. Schnetz. 2007. Repression by binding of H-NS within the transcription unit. J. Biol. Chem. 282:23622-23630.
25. Parker, L. L., and B. G. Hall. 1988. A fourth Escherichia coli gene system with the potential to evolve β-glucoside utilization. Genetics 119:485-490.
26. Prasad, I., and S. Schaefler. 1974. Regulation of the β-glucoside system in Escherichia coli K-12. J. Bacteriol. 120:638-650.
27. Prasad, I., B. Young, and S. Schaefler. 1973. Genetic determination of the constitutive biosynthesis of phospho-β-glucosidase A in Escherichia coli K-12. J. Bacteriol. 114:909-915.
28. Raghunand, T. R., and S. Mahadevan. 2003. The β-glucoside genes of Klebsiella aerogenes: conservation and divergence in relation to the cryptic bgl genes of Escherichia coli. FEMS Microbiol. Lett. 223:267-274.
29. Reynolds, A. E., J. Felton, and A. Wright. 1981. Insertion of DNA activates the cryptic bgl operon of E. coli K12. Nature 293:625-629.
30. Sambrook, J., and D. W. Russell. 2001. Molecular cloning: a laboratory manual, 3rd ed. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY.
31. Sankar, T. S., G. Neelakanta, V. Sangal, G. Plum, M. Achtman, and K. Schnetz. 2009. Fate of the H-NS-repressed bgl operon in evolution of Escherichia coli. PLoS Genet. 5:e1000405.
32. Schaefler, S. 1967. Inducible system for the utilization of β-glucosides in Escherichia coli. I. Active transport and utilization of β-glucosides. J. Bacteriol. 93:254-263.
33. Schaefler, S., and A. Malamy. 1969. Taxonomic investigations on expressed and cryptic phospho-β-glucosidases in Enterobacteriaceae. J. Bacteriol. 99:422-433.
34. Schnetz, K. 1995. Silencing of Escherichia coli bgl promoter by flanking sequence elements. EMBO J. 14:2545-2550.
35. Schnetz, K., and B. Rak. 1992. IS5: a mobile enhancer of transcription in Escherichia coli. Proc. Natl. Acad. Sci. USA 89:1244-1248.
36. Schnetz, K., C. Toloczyki, and B. Rak. 1987. β-Glucoside (bgl) operon of Escherichia coli K-12: nucleotide sequence, genetic organization, and possible evolutionary relationship to regulatory components of two Bacillus subtilis genes. J. Bacteriol. 169:2579-2590.
37. Tamura, K., J. Dudley, M. Nei, and S. Kumar. 2007. MEGA4: molecular evolutionary genetics analysis (MEGA) software version 4.0. Mol. Biol. Evol. 24:1596-1599.
38. UniProt Consortium. 2008. The universal protein resource (UniProt). Nucleic Acids Res. 36:D190-D195.
39. Wilson, G., and C. F. Fox. 1974. The β-glucoside system of Escherichia coli. IV. Purification and properties of phospho-β-glucosidases A and B. J. Biol. Chem. 249:5586-5598.
40. Wilson, G. G., K. Y. K. Young, G. J. Edlin, and W. Konigsberg. 1979. High-frequency generalised transduction by bacteriophage T4. Nature 280:80-82.
41. Wirth, T., D. Falush, R. Lan, F. Colles, P. Mensa, L. H. Wieler, H. Karch, P. R. Reeves, M. C. J. Maiden, H. Ochman, and M. Achtman. 2006. Sex and virulence in Escherichia coli: an evolutionary perspective. Mol. Microbiol. 60:1136-1151.

Articles from Applied and Environmental Microbiology are provided here courtesy of American Society for Microbiology (ASM)