J Bacteriol. Sep 2008; 190(17): 5832–5840.
Published online Jun 27, 2008. doi:  10.1128/JB.00480-08
PMCID: PMC2519529

Genomic O Island 122, Locus for Enterocyte Effacement, and the Evolution of Virulent Verocytotoxin-Producing Escherichia coli


The locus of enterocyte effacement (LEE) and genomic O island 122 (OI-122) are pathogenicity islands in verocytotoxin-producing Escherichia coli (VTEC) serotypes that are associated with outbreaks and serious disease. Composed of three modules, OI-122 may occur as “complete” (with all three modules) or “incomplete” (with one or two modules) in different strains. OI-122 encodes two non-LEE effector (Nle) molecules that are secreted by the LEE type III secretion system, and LEE and OI-122 are cointegrated in some VTEC strains. Thus, they are functionally linked, but little is known about the patterns of acquisition of these codependent islands. To examine this, we conducted a population genetics analysis, using multilocus sequence typing (MLST), with 72 VTEC strains (classified into seropathotypes A to E) and superimposed on the results the LEE and OI-122 contents of these organisms. The wide distribution of LEE and OI-122 modules among MLST clonal groups corroborates the hypothesis that there has been lateral transfer of both pathogenicity islands. Sequence analysis of a pagC-like gene in OI-122 module 1 also revealed two nonsynonymous single-nucleotide polymorphisms that could help discriminate a subset of seropathotype C strains and determine the presence of the LEE. A nonsense mutation was found in this gene in five less virulent strains, consistent with a decaying or inactive gene. The modular nature of OI-122 could be explained by the acquisition of modules by lateral transfer, either singly or as a group, and by degeneration of genes within modules. Correlations between clonal group, seropathotype, and LEE and OI-122 content provide insight into the role of genomic islands in VTEC evolution.

Verocytotoxin-producing Escherichia coli (VTEC) and Shiga toxin-producing E. coli (STEC) are emerging zoonotic pathogens consisting of multiple serotypes, over 200 of which have been isolated from cases of human disease (48). Human infection by some VTEC serotypes, notably O157:H7, is associated with significant outbreaks and may lead to serious complications, such as hemorrhagic colitis and the hemolytic-uremic syndrome (HUS) (23, 25, 34). Karmali et al. (24) have classified VTEC into five seropathotype groups based on the relative frequency with which the serotypes are associated with serious and epidemic human disease (Table 1). There is a strong association between VTEC seropathotypes A and B that cause serious or epidemic disease and the presence of two genomic islands, the locus of enterocyte effacement (LEE), which is associated with the characteristic “attaching and effacing” lesions (20, 34), and genomic O island 122 (OI-122) (24).

TABLE 1.
Classification of VTEC serotypes into five seropathotype groups

All of the virulence factors necessary for the formation of the attaching and effacing lesions in VTEC are encoded by the LEE pathogenicity island (18, 20, 34, 47), which encodes the structural, accessory, and effector molecules of this type III secretion system (4, 12, 13, 18, 19). The LEE of VTEC serotype O157:H7 reference strain EDL 933, which is ~43.4 kb long, contains 41 open reading frames which are organized in five polycistronic operons (LEE 1, LEE 2, LEE 3, LEE 5, and LEE 4) (34). Of particular interest in this study was LEE 5, which contains the eae gene, which encodes the outer membrane adhesin intimin (34). This operon also contains genes that encode the translocated intimin receptor known as Tir (27) or EspE (7) and the Tir chaperone, CesT (1, 10).

OI-122 is a 23-kb pathogenicity island in O157:H7 strain EDL 933 which consists of three distinct modules separated by mobile genetic elements (Fig. 1) (24, 38). The first module encodes Z4321, a gene product with 46% homology to the phoP-activated gene C product (PagC) that enables survival in macrophages of Salmonella enterica serovar Typhimurium (30, 31, 36) (Fig. 1). This module is present in strains ranging from strains carrying a complete OI-122 with all three modules to incomplete strains carrying only this module. Module 2 carries the Z4326 (sen) gene, whose product is 39% homologous to Shigella enterotoxin, and genes encoding two proteins, Z4328 and Z4329, with 89 and 86% sequence homology to non-LEE-encoded (Nle) effectors of Citrobacter rodentium, NleB and NleE, respectively (8, 24). The third module encodes Z4332 and Z4333, which are enterohemorrhagic E. coli (EHEC) factors for adherence (Efa1 and Efa2).

FIG. 1.
Modular components of OI-122. ISA, insertion sequence-associated elements (or putative transposases) between the three modules. The PCR gene markers used to detect the presence of modules are indicated by bold type and blue.

OI-122 and LEE are functionally related; the LEE-encoded type III secretory apparatus is required for secretion of the OI-122 non-LEE-encoded effectors NleB and NleE encoded in module 2 (8, 26). Multiple strains of a given serotype have conserved patterns in the modular arrangement of OI-122 genes (24). It has been proposed that the transposon-like independent mobile elements of OI-122 are acquired or lost in a modular manner (46). The evidence which supports this proposal includes integration of OI-122 and LEE into each other in various forms. While LEE is 0.7 Mb downstream of a complete OI-122 in EDL 933 (33, 38), mosaics of these two islands have been identified by several groups. Shen et al. identified a mosaic island in which OI-122 (with only module 1) was embedded in OI-48 in seropathotype C O113:H21 strain CL3 (41). In the VTEC O26:H11-related RDEC strains, LEE is immediately upstream of OI-122 (modules 2 and 3), cointegrated as a 58-kb mosaic island (33, 44). A similar structure was found in EHEC serotype O103:H2 strains, where LEE is 43 kb downstream of OI-122 (modules 2 and 3), again physically linked in this 111-kb hybrid island (22). Given that the presence of both LEE and OI-122 is associated with the most pathogenic VTEC strains, the objective of this study was to investigate the patterns of acquisition of these genomic islands in VTEC. This was done by overlaying the distribution of LEE and OI-122 and seropathotypes on VTEC strain classification by multilocus sequence typing (MLST).

MLST was chosen to shed light on the timelines for acquisition of these genomic islands in the evolutionary history of this pathogen by using inferred relationships among the collection of seropathotype strains. In this study, the VTEC strain group was analyzed using the MLST scheme developed at the STEC Center (http://www.shigatox.net/stec/mlst-new/index.html). Using this method, phylogenetic relationships were inferred based on differences among “core” genomes deduced from an analysis of seven highly conserved housekeeping genes (see Table S1 in the supplemental material). These carefully selected genes are presumably inherited vertically rather than horizontally and are subject to selective pressures, so there is slow, continual acquisition of random nucleotide changes (49). Studies have indicated that housekeeping genes diverge at a rate that reflects the overall rate of genome divergence due to vertical and horizontal transfer events, as well as genome reduction (29, 42, 49). The position of each node (strain) on a tree based on MLST data inherently reflects this genome diversity and provides a visual indication of how the genomes evolved relative to one another. The MLST technique was used to generate clustering patterns for 72 VTEC strains, which were then analyzed to determine correlations between clonal groups, seropathotypes, and genomic island content.


Bacterial strains.

Seventy-two VTEC strains were used in this study (Fig. 2). The housekeeping gene sequences were obtained from the E. coli MLST (EcMLST) database for four other E. coli reference (ECOR) strains, including the O55:H7 progenitor strain, ECOR37, and for strain K-12, which was included as a non-VTEC reference strain. Housekeeping gene sequences for the commensal organism E. coli HS (accession no. AAJY00000000), uropathogenic E. coli strain CFT073 (accession no. AE014075), Shigella flexneri 2a (accession no. AE005674), and S. enterica serovar Typhimurium LT2 (accession no. AE006468) were retrieved from the NCBI GenBank database. These non-VTEC isolates were added to give the MLST phylogenetic tree some additional structure and to permit us to draw further conclusions based on knowledge of their evolutionary relatedness to VTEC. Since it is known that S. enterica serovar Typhimurium diverged from E. coli ~10^8 years ago (37), strain LT2 was selected for use as the outgroup (root) in the MLST tree.

FIG. 2.
Overlay of genomic island, clonal group, and seropathotype distributions relative to the MLST-inferred phylogenies (based on seven housekeeping genes) for 72 VTEC isolates. The box indicates the strains with a premature stop codon in a pagC-like gene, ...


The housekeeping genes were PCR amplified from genomic DNA isolated from each of the 72 VTEC strains and K-12 using a Qiagen DNeasy kit (Qiagen Inc., Mississauga, Canada). The primers used for PCR amplification, as well as sequencing, which were previously designed at the STEC Center (http://www.shigatox.net/stec/mlst-new/mlst_primers.html), are shown in Table S2 in the supplemental material. The PCR amplification conditions were 35 cycles of 92°C for 1 min, 58°C for 1 min, and 72°C for 30 s, with an initial denaturing step of 94°C for 10 min for aspC, clpX, fadD, and lysP. For icdA, mdh, and uidA, a shorter extension time (15 s) at 72°C for 40 cycles was used. AmpliTaq Gold with buffer II was used (Applied Biosystems, Foster City, CA) for increased specificity.

The amplicons were purified using a Qiagen PCR purification kit (Qiagen Inc., Mississauga, Canada) and were sequenced using a DYEnamic ET terminator cycle sequencing kit and a MegaBACE 500 automated DNA sequencer (Amersham Biosciences UK Ltd., Buckinghamshire, England). Sequencing of amplified fragments was done in both directions and in duplicate, so that for each gene a consensus sequence was derived from four sequence reads using Discovery Studio Gene software (Accelrys Software Inc., San Diego, CA). The gene consensus sequences were aligned using ClustalX (45). For each strain, the seven-gene consensus sequence was concatenated using Molecular Evolutionary Genetics Analysis (MEGA), version 3.1 (28), in the order aspC-clpX-fadD-icdA-lysP-mdh-uidA, giving an MLST “supergene” sequence that was 3,558 bp long (see Table S2 in the supplemental material).
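The concatenation step can be sketched in a few lines of Python. This is only an illustration of joining per-gene consensus sequences in the stated order; the function name `concatenate_supergene` and the toy fragments are hypothetical and are not the MEGA workflow or real allele sequences.

```python
# Gene order stated in the text for the concatenated MLST "supergene".
GENE_ORDER = ["aspC", "clpX", "fadD", "icdA", "lysP", "mdh", "uidA"]


def concatenate_supergene(consensus_by_gene):
    """Join per-gene consensus sequences in GENE_ORDER (hypothetical helper)."""
    missing = [gene for gene in GENE_ORDER if gene not in consensus_by_gene]
    if missing:
        raise ValueError(f"missing consensus sequence(s): {missing}")
    return "".join(consensus_by_gene[gene].upper() for gene in GENE_ORDER)


# Toy 4-bp fragments for illustration only; NOT real housekeeping gene alleles.
toy_strain = {
    "aspC": "ATGC", "clpX": "GGTA", "fadD": "CCAT", "icdA": "TTGA",
    "lysP": "ACGT", "mdh": "GATC", "uidA": "TGCA",
}

supergene = concatenate_supergene(toy_strain)
print(supergene, len(supergene))
```

Keeping the gene order in one constant ensures that every strain is concatenated identically, which is required for the columns of the resulting alignment to remain comparable across strains.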

Sequencing of the Z4321 (pagC-like) gene.

The Z4321 locus was PCR amplified using forward primer 5′-ATGAGTGGTTCAAGACTGG-3′ and reverse primer 5′-CCAACTCCAACAGTAAATCC-3′, yielding a 521-bp amplicon (24). This amplicon was sequenced as described above, and the sequence data were cropped to 399 bp prior to ClustalW alignment with Discovery Studio Gene software.

Data analysis.

The “supergene” sequence was used to generate a dendrogram using MEGA software (neighbor-joining algorithm with the Tajima-Nei model of genetic distance and bootstrapping of 1,000 replicates). The tree was subsequently examined for patterns in the genomic island contents of strains in the context of their seropathotype and clonal group designations.
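For readers unfamiliar with the Tajima-Nei model, the pairwise distance that feeds a neighbor-joining tree can be sketched as follows. This is a minimal illustration of the published Tajima-Nei formula, d = -b ln(1 - p/b), and not a reproduction of MEGA's implementation; the function name is hypothetical, and sites containing anything other than A, C, G, or T are simply skipped as an assumption of this sketch.

```python
import math
from itertools import combinations

BASES = "ACGT"


def tajima_nei_distance(seq1, seq2):
    """Tajima-Nei distance between two aligned sequences (illustrative sketch)."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    # Keep only columns where both sequences have an unambiguous base.
    pairs = [(a, b) for a, b in zip(seq1.upper(), seq2.upper())
             if a in BASES and b in BASES]
    n = len(pairs)
    if n == 0:
        raise ValueError("no comparable sites")
    # p: proportion of sites that differ.
    p = sum(a != b for a, b in pairs) / n
    if p == 0:
        return 0.0
    # g_i: base frequencies averaged over both sequences.
    g = {x: sum((a == x) + (b == x) for a, b in pairs) / (2 * n) for x in BASES}
    # c: sum over unordered base pairs of x_ij^2 / (2 g_i g_j).
    c = 0.0
    for i, j in combinations(BASES, 2):
        x_ij = sum({a, b} == {i, j} for a, b in pairs) / n
        if x_ij > 0:
            c += x_ij ** 2 / (2 * g[i] * g[j])
    b_coef = 0.5 * (1 - sum(f ** 2 for f in g.values()) + p ** 2 / c)
    if p >= b_coef:
        raise ValueError("sequences too divergent; distance undefined")
    return -b_coef * math.log(1 - p / b_coef)


# Toy sequences for illustration only.
print(tajima_nei_distance("ACGTACGTAC", "ACGTACGTAT"))
```

With equal base frequencies and evenly distributed mismatches, b reduces to 3/4 and the expression becomes the Jukes-Cantor distance. Bootstrapping then repeats the tree construction on alignments resampled column by column with replacement.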

Correlations were also made between the presence of complete and incomplete OI-122 and LEE in the context of the strains' propensity to cause disease. Genes spanning all three modules of OI-122 (Z4321, Z4326, Z4332, and Z4333) (Fig. 1) and the eae locus of the LEE (Z5110) were previously amplified for the seropathotype collection (24). The primers used for PCR amplification were described by Karmali et al. (24).

A protein tree of Z4321 was also made using MEGA (unweighted-pair group method using average linkages algorithm with bootstrapping of 1,000 replicates) to determine subgroups of strains based on amino acid differences. The Nei-Gojobori procedure (28) was also performed using MEGA to evaluate the substitution rates.


Seropathotype, clonal group, and genomic island distributions in the MLST tree.

Among the VTEC isolates, seropathotype A strains cluster as a distinct MLST clonal group carrying both LEE and a complete OI-122 (Fig. 2). This O157 group corresponds to MLST clonal group 11 (EHEC 1) and is closely related to the putative ancestral E. coli serogroup O55:H7 strain (2, 3, 11). In the non-O157 branch, the LEE content and the OI-122 forms vary within clonal groups. Seropathotype B strains are dispersed in four different MLST clusters, including EHEC 2, EHEC-O121, and STEC 2. The fourth cluster, which is less closely related to the other three clusters, all of which share a more recent common ancestor, consists of O145:NM strains. Seropathotype B strains are positive for LEE and possess either a complete or incomplete OI-122 (Fig. 2). Most seropathotype C, D, and E strains occur in a wide diversity of clonal groups, and these strains include members of MLST groups STEC 1, STEC 2, and EHEC-O121. Interestingly, only module 1 is present when LEE is absent (Fig. 2). This may be due to the absence of type III effectors (LEE associated) in this module. In contrast, module 2, which encodes effectors secreted by LEE (NleB and NleE), correlates highly with LEE. Multiple strains of a serotype usually have the same pattern of gene deletions or insertions in OI-122; for example, incomplete OI-122 patterns are conserved in four strains belonging to serotype O113:H21 (LEE negative) and three strains belonging to serotype O26:H11 (LEE positive).

S. enterica serovar Typhimurium LT2 differed significantly at the genetic level from the VTEC isolates and other strains and so constitutes the outgroup node or root of the tree. S. flexneri strain 2a clustered together with uropathogenic E. coli strain CFT073 and two seropathotype D serotype O117:H7 strains. Reference strain K-12 clustered most closely with the other commensal strain, strain HS, and with low-virulence strains (seropathotype D and E strains and environmental isolates ECOR-01 and ECOR-04).

Sequencing of the pagC-like Z4321 gene.

Overall, 11 single-nucleotide polymorphisms (SNPs) and one indel (insertion of adenine nucleotide) were found in the pagC-like gene of 43 VTEC strains positive for this locus (see Table S3 in the supplemental material). The results for three O145:NM strains and one O119:H25 strain had discrepancies with previous results for the pagC-like locus (24) due to the presence of weak PCR bands. Of greatest interest in the SNP analysis were the mutations that led to amino acid substitutions in the encoded protein. Figure 3 shows the interrelatedness of Z4321-positive strains and their groupings based on differences in protein sequence. Sequence analysis of the pagC-like Z4321 locus revealed a nonsense mutation in five strains, three seropathotype D strains (human serotype O103:H25 and O119:H25 strains) and two seropathotype E strains (bovine O98:H25 and O84:NM strains), as shown in Fig. 2. Strains with this decayed OI-122 module belong to a single MLST clonal group, group 20, with high tree branch reliability (100% bootstrap support). They are grouped together despite variations in host (human and bovine), serotype (somatic antigens O103, O119, and O98 with flagellar antigen H25 and O84:NM), seropathotype (seropathotypes D and E), and genomic island content (LEE positive with incomplete and complete forms of OI-122). The mutation in the Z4321 gene is the result of insertion of an A at nucleotide 388, which led to a shift in the reading frame and changed the last two codons of the protein product before the premature stop (Fig. 3). Thus, this outer membrane protein is truncated at the end of the third transmembrane loop and lacks the fourth and final loop of PagC. The nucleotide sequence downstream of the stop codon is conserved among the five strains with this mutation. Among the five strains with this truncated gene product are the only human seropathotype D strains in the VTEC collection in which both LEE and OI-122 module 1 are present. These less virulent (seropathotype D) human isolates either have modules 1 and 2 but lack module 3 (serotype O119:H25) or have a complete OI-122 (serotype O103:H25).
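The effect of a single-base insertion on the reading frame can be illustrated with a short sketch. The sequence below is a toy construct chosen so that an inserted A changes two codons and then creates a premature stop; it is not the Z4321 sequence, and the helper names are hypothetical.

```python
from itertools import product

# Standard genetic code, built in the conventional TCAG ordering.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(_BASES, repeat=3), _AMINO)}


def translate(seq):
    """Translate from the first base until the first stop codon."""
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3].upper()]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def insert_base(seq, position, base):
    """Insert `base` so that it becomes nucleotide `position` (1-based)."""
    return seq[:position - 1] + base + seq[position - 1:]


# Toy coding sequence (NOT Z4321): ATG GCT GCA GGT GAC CTG TAA
wild_type = "ATGGCTGCAGGTGACCTGTAA"
mutant = insert_base(wild_type, 7, "A")  # ATG GCT AGC AGG TGA ...

print(translate(wild_type))  # MAAGDL
print(translate(mutant))     # MASR
```

In the toy mutant, the two codons after the insertion point encode different residues and the third shifted codon is a stop, which mirrors the pattern described for the insertion at nucleotide 388.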

FIG. 3.
Phylogenetic tree based on PagC-like Z4321 protein in 43 VTEC strains. SNPs in Z4321 in S. enterica serovar Typhimurium were too numerous to display (TNTD). For an explanation of the symbols, see the legend to Fig. 2. n/a, gene was not analyzed. ...

Furthermore, strains positive for the pagC-like locus fall into two groups, which are distinguishable by the presence of a nonsynonymous SNP at nucleotide 207 of this gene (Fig. 3). One group consists of eae (LEE)-positive strains that have His at the corresponding position in the Z4321 protein. The LEE-positive strains that have pagC have an incomplete OI-122 (modules 1 and 2) or a complete OI-122 (all three modules). All of these strains (human seropathotypes A, B, and C) have allelic variant 1 of pagC. The second group of Z4321-positive strains (human seropathotype C and bovine seropathotypes D and E) lack the LEE, have a Gln codon at bp 207, and do not carry module 2 or 3. These LEE-negative strains exclusively harboring pagC have four SNPs (allelic variant 2) and seven SNPs (allelic variant 3), respectively, compared to the variant 1 allele (see Table S3 in the supplemental material). Note that the LEE-positive branch with the premature stop codon in the pagC-like locus differs only by a single adenine insertion at nucleotide 388 (human seropathotype D and bovine seropathotype E) from allelic variant 1 and comprises allelic variant 4.

There is a division among seropathotype C strains with flagellar H21 antigen (O91:H21 and O104:H21 versus O113:H21) as a result of a mutation at nucleotide 119 (Fig. 3). O113:H21 strains encode Tyr (TCT) at this location, and this amino acid is unique to this group compared with all the other Z4321-positive VTEC strains. The O113:H21 strains make up STEC 2 clonal group 30, while the O91:H21 and O104:H21 strains, which encode Ser (TAT), constitute STEC 1 clonal groups 34 and 18, respectively.
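To relate the reported nucleotide positions (119, 207, and 388) to codons, the following sketch converts a nucleotide position into a codon number and a position within that codon. It assumes that numbering is 1-based and starts at the first base of the reading frame, which the text does not state explicitly; the function name is hypothetical.

```python
def codon_coordinates(nt_position):
    """Return (codon number, position within codon), both 1-based."""
    if nt_position < 1:
        raise ValueError("nucleotide positions are 1-based")
    codon_index, offset = divmod(nt_position - 1, 3)
    return codon_index + 1, offset + 1


for pos in (119, 207, 388):
    print(pos, codon_coordinates(pos))
# 119 (40, 2)
# 207 (69, 3)
# 388 (130, 1)
```

Under this assumption, nucleotide 207 is the third base of codon 69, which is consistent with a His/Gln difference because His (CAT, CAC) and Gln (CAA, CAG) codons differ only at the third position.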

G+C content.

It was observed that the average G+C content of the housekeeping genes (range, 51.5 to 54.0% [see Table S2 in the supplemental material]) corresponds well with the overall host genome base composition (E. coli K-12 [accession no. NC_000913], 50.8%; EDL 933 [accession no. NC_002655], 50.4%). The average G+C content of the pagC-like Z4321 gene is 40.0%, which is low compared to that of the overall genome, as expected for a gene on a pathogenicity island (17).
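The G+C comparison rests on a simple calculation, sketched below. The helper name `gc_content` and the toy input are illustrative; ambiguous characters such as N are ignored here, which is an assumption of this sketch rather than a documented step of the study.

```python
def gc_content(seq):
    """Percent G+C among unambiguous bases (A, C, G, T) in a sequence."""
    counted = [base for base in seq.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(base in "GC" for base in counted)
    return 100.0 * gc / len(counted)


# Toy sequence for illustration only.
print(gc_content("ATGCGCTTAA"))  # 40.0
```

A gene whose value falls well below the genome-wide figure, as reported for Z4321, is a typical signature of horizontally acquired DNA.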

Substitution rates.

The Nei-Gojobori procedure (28) was performed using MEGA to evaluate the substitution rates for individual housekeeping genes and the concatenated supergene, as well as the pagC-like gene. More specifically, the numbers of synonymous (silent) substitutions per site (pS) and of nonsynonymous substitutions per site (pN), the latter leading to differences in amino acid sequence, were estimated for the housekeeping genes and the pagC-like Z4321 locus. Under neutral evolution, the two rates are expected to be equal (pS/pN = 1), whereas positive (diversifying) selection occurs when pN > pS and negative (purifying) selection occurs when pS > pN (35). For each of the housekeeping genes and the concatenated supergene, the rate of synonymous mutation is higher than the rate of nonsynonymous mutation (pS > pN), which implies that there is purifying selection (Table 2; see Table S4 in the supplemental material), as expected. In the housekeeping genes, the rate of synonymous mutation is approximately 42-fold higher than the rate of nonsynonymous mutation (Table 2). The most divergent housekeeping gene (i.e., the least conserved gene) with the highest rate of nonsynonymous substitution is uidA (see Table S4 in the supplemental material). The rate of nonsynonymous substitution was 20-fold higher in pagC than in the housekeeping genes, which is consistent with a divergent gene; further, the rate of synonymous substitution was 1.6 times lower.
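The interpretation rule for pS and pN can be written as a small decision function. This sketch only encodes the comparison described in the text; it does not implement the Nei-Gojobori counting of synonymous and nonsynonymous sites, and the function name, the `tolerance` parameter, and the example values are hypothetical.

```python
def selection_regime(p_s, p_n, tolerance=0.0):
    """Classify selection from synonymous (p_s) and nonsynonymous (p_n) rates."""
    if p_s < 0 or p_n < 0:
        raise ValueError("substitution rates cannot be negative")
    if abs(p_s - p_n) <= tolerance:
        return "neutral"
    return "purifying" if p_s > p_n else "positive"


# Illustrative values only; these are NOT the values reported in Table 2.
print(selection_regime(0.05, 0.002))   # purifying
print(selection_regime(0.01, 0.03))    # positive
print(selection_regime(0.02, 0.02))    # neutral
```

In practice, a statistical test is needed before concluding that two estimated rates differ, so a nonzero `tolerance` is only a crude stand-in for that step.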

TABLE 2.
Rates of synonymous and nonsynonymous substitution in Z4321 and in the MLST supergene sequence based on Nei-Gojobori and Jukes-Cantor analysis for the overall VTEC collection and for seropathotypes

The rates of substitution were also analyzed separately for each of the five seropathotypes (seropathotypes A through E) to test the hypothesis that the individual groups evolve with different substitution rates. There were general increases in the rates of both synonymous substitution and nonsynonymous substitution in the less virulent seropathotypes in both the housekeeping gene (supergene) and pagC-like gene sequences (Table 2). In seropathotype D and E strains there was a significantly higher rate of mutation (especially pN) in pagC than in the supergene (P < 0.05, Student-Newman-Keuls posttest). The pS/pN value is closer to 1 for pagC in these strains, consistent with a shift toward neutral selection after the introduction of the stop codon (gene inactivation).


The predicted substitution rates and the G+C content of the housekeeping genes compared to that of the overall genome confirm that these genes are being evolutionarily conserved in VTEC genomes, thereby validating the use of MLST for defining evolutionarily related clonal groups. These observations confirm our assumption that the selected housekeeping genes are vertically acquired and so provide a snapshot of the divergence that occurs in the overall genome (49).

E. coli O157:H7 strains associated with outbreaks and severe epidemicity have been shown to represent a single phylogenetic branch when they are grouped by MLST (using seven housekeeping genes), comprising 100% of the seropathotype A strains (40, 50, 51). It has been postulated that over the last 50 years, the pathogenic O157 lineage has evolved from an enteropathogenic E. coli O55:H7 group with the acquisition of verotoxin-converting phages and an O157 rfb (O-antigen subunit) gene cluster (3). This finding was corroborated in the current study, where all of the O157 strains clustered as a single group whose nearest neighbor was the enteropathogenic E. coli O55 strain (Fig. 2). The O157 group of strains and their O55 ancestor have either converged or diverged from non-O157 VTEC at some point in their evolutionary history. This separation may coincide with the evolutionary split of O157 and K-12, which occurred 4.5 million years ago (21). S. flexneri 2a, which mapped among the outliers from the major non-O157 cluster, is more closely related to K-12 than to EDL 933 (O157 seropathotype A) (21), and this was confirmed by the finding that these organisms share a more recent common ancestral node in the tree. The facts that uidA was found to be the least conserved of the housekeeping genes and that it was absent only in S. enterica serovar Typhimurium LT2, which is the outlier strain in the MLST tree, reaffirm the structure of the tree. There have been more evolutionary splits from Salmonella in the non-O157 clusters than in the O157 cluster.

The highly branching non-O157 group reflects a high degree of genetic rearrangement compared to the O157 cluster. It can be postulated that losing genetic factors and moving from virulent to less virulent may give new non-O157 variants a selective advantage in surviving and/or in contributing to pathogenesis during VTEC infection. Some of the internal non-O157 branches in the tree may represent the fastest-evolving strains (including seropathotype D and E strains with a nonsense mutation in pagC-like gene Z4321), and a lot of variation in genomic island content has been observed within these closely related subclusters. High substitution rates in non-O157 (seropathotype C, D, and E) strains corroborate this observation. Examination of the occurrence of the LEE and components of OI-122 in widely divergent MLST clonal groups has provided striking evidence of horizontal transfer of chromosomal genes and pathogenicity islands. Furthermore, this study, using OI-122 as an example, provides novel insights into the acquisition and fate of island components.

Evidence of horizontal gene transfer among non-O157 strains.

The O157 EHEC 1 group represents the only VTEC group for which there is a direct correlation between seropathotype (seropathotype A) and genomic island content (LEE positive with complete OI-122). Otherwise, among the non-O157 VTEC strains, it is clear from the MLST clustering patterns that seropathotype and genomic island distribution are not clonally restricted. In fact, seropathotypes are widely dispersed throughout the tree and are more widely dispersed with decreasing level of epidemicity (seropathotype A clusters on one branch, seropathotype B clusters on four branches, seropathotype C clusters on five branches, seropathotype D clusters on nine branches, and seropathotype E clusters on nine branches [Fig. 2]). LEE and the various OI-122 forms (complete, incomplete, or absent) are widely distributed in different lineages (clonal groups) and seropathotypes. Modular components of OI-122 that are variably present or absent on a branch containing closely related strains likely were obtained from less closely related strains or species via horizontal gene transfer. For example, examination of the MLST tree around seropathotype C, D, and E strains shows that there is a mixture of LEE-positive and -negative strains in a branch. This scattered distribution of eae genes in the MLST tree is characteristic of a horizontal transfer event when a set of genes has been introduced into a lineage (9). Among seropathotype B through E strains, OI-122 modules and their gene content are similarly scattered throughout the tree. For example, in branches where all strains belong to the same serotype and have the same incomplete form of OI-122, it is likely that components were acquired in a modular manner and became stabilized in the genome. The observations for LEE and OI-122 described above support the notion that the virulence genes comprising these gene cassettes have been and continue to be horizontally transferred across lineages. The wide MLST clonal distribution of these two islands and the lack of association between seropathotype, genomic island content, OI-122 module content, and MLST clustering patterns are also indicative of horizontal transfer among strains (Fig. 2).

pagC-like Z4321 deletion mutants in less virulent strains.

The evidence of decay of OI-122 elements in module 1 (Z4321) in five strains that belong to seropathotypes D and E correlates with the apparent reduced virulence of these LEE-positive strains. It also strongly indicates that there has been horizontal gene transfer since this mutated genetic element is shared by strains whose seropathotype and genomic island profiles and hosts differ but the strains are closely related phylogenetically (15). The sequence flanking the indel (an adenine insertion), particularly downstream of the premature stop codon, which no longer encodes a functional protein product, is conserved in these different strains. Based on the minimal assumption of evolution, this insertion was introduced once at some point in the evolutionary history of this collection and was passed horizontally among the strains (15). The OI-122 modular patterns may reflect horizontal acquisition of one or more modules independently or modular decay following transfer of a complete OI-122 (Fig. 3). A correspondingly high rate of synonymous and nonsynonymous (detrimental) substitutions in the Z4321 gene in these less virulent seropathotypes is also consistent with a decaying or inactive gene. Functional protein studies with Yersinia and Salmonella involving closely related Ail and Rck proteins indicated that the fourth extracellular loop (absent in the truncated Z4321 protein) is not associated with adhesion, invasion, or serum resistance phenotypes (5, 32, 36). The third loop (full length in the truncated Z4321 protein) has been shown to confer virulence properties in Rck (5). In the future, functional assays may be performed to assess the impact of this mutation on the Z4321 protein in VTEC. Wickham et al. showed that there is a significant association between the presence of a combination of pagC and sen (ent), nleB, and efa-1/lifA and HUS after infection in non-O157 E. coli (46). On its own, the pagC-like gene is associated with HUS but not with outbreaks (46). It is interesting that the seropathotype E strains in this study which had the mutated pagC gene were of bovine origin and also did not contain efa-1 (Fig. 2). It has been proposed that the additive effect of these two genes contributes significantly to causing HUS (6, 46). While the pagC locus may contribute to pathogenesis in more virulent VTEC, pseudogenization may have hampered its activity and given rise to these less virulent variants. The observation that human strains without this deleterious mutation in pagC (on a complete OI-122 with LEE present) are seropathotype A, B, or C strains and strains with the truncated gene (on a complete or incomplete OI-122 with LEE) are seropathotype D strains supports this theory.

SNPs in Z4321 useful for differentiation of O113:H21 and the presence of LEE.

Human seropathotype C strains with flagellar H21 antigen were originally classified in one clonal group (http://www.shigatox.net/cgi-bin/stec/clonal). Data from this study show that these strains belong to different clonal groups (Fig. 2), and this was corroborated by a SNP in Z4321 (A → C at bp 119) that results in an O113:H21-specific Tyr residue (in clonal group 30) instead of Ser, which is found in all the other serotypes tested, including O91:H21 (clonal group 34) and O104:H21 (clonal group 18). While they share a common flagellar H21 antigen and have the same OI-122 and LEE profiles (LEE negative and OI-122 module 1 only [Fig. 2]), these groups split at some point in their evolutionary history. There may be other genomic differences between these groups, but targeting this SNP may be a quick way to differentiate between the H21 clonal groups and to screen for the O113:H21 serotype.

A second nonsynonymous SNP at nucleotide 207 of the Z4321 gene allowed us to predict the presence of LEE because strains having His at the corresponding position harbor eae, while strains with Gln lack eae. An interesting observation is that only the strains that carry the pagC-like gene exclusively (modules 2 and 3 are not present) both have this nonsynonymous substitution (His → Gln) and lack LEE. Strains with this property include O113:H21 strain CL3, in which Z4321 is part of a mosaic island cointegrated with OI-48 (41). Screening for a marker, Z1640::S1, that is indicative of this hybrid island indicated that other serotypes in the VTEC collection with this characteristic include O156:NM, O171:H2, O7:H4, O88:H25, and O91:H21 (41). It is unclear whether the pagC-like gene first appeared in strains such as O157:H7 strain EDL 933 as part of a complete OI-122 or as part of the OI-122::OI-48 mosaic island, as observed in O113:H21 strain CL3. The pagC alleles (alleles 1 and 4) of LEE-positive strains have four to eight nucleotide differences compared with the allelic variants (alleles 2 and 3) of the LEE-negative lineages (see Table S3 in the supplemental material). The pagC-like alleles may have been exchanged between the LEE-positive and -negative lineages at some point, along with the acquisition of SNPs. Given that there are more than a few nucleotide differences between these genes, the possibility that the genes may also have arisen from a separate ancestor cannot be overlooked.
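The His/Gln rule described above can be expressed as a simple screening function. This is a sketch of the correlation reported for the strains in this collection, not a validated diagnostic; it assumes that nucleotide numbering is 1-based and in frame with the coding sequence, and the function name and toy sequences are hypothetical.

```python
HIS_CODONS = {"CAT", "CAC"}
GLN_CODONS = {"CAA", "CAG"}


def predict_lee_from_z4321(coding_seq, nt_position=207):
    """Predict LEE status from the codon that contains `nt_position`."""
    start = 3 * ((nt_position - 1) // 3)
    codon = coding_seq[start:start + 3].upper()
    if len(codon) < 3:
        raise ValueError("sequence does not cover the requested codon")
    if codon in HIS_CODONS:
        return "LEE-positive"
    if codon in GLN_CODONS:
        return "LEE-negative"
    return "undetermined"


# Toy sequences (NOT Z4321): 68 filler codons followed by the codon of interest.
his_allele = "GCT" * 68 + "CAC"
gln_allele = "GCT" * 68 + "CAA"

print(predict_lee_from_z4321(his_allele))  # LEE-positive
print(predict_lee_from_z4321(gln_allele))  # LEE-negative
```

Returning "undetermined" for any other codon avoids forcing a call on alleles that were not observed in the study.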

Concluding remarks.

Bacterial evolution is driven by the need to achieve optimal “fitness,” a concept that refers to attributes that enhance the survival, spread, and/or transmission of an organism within a specific ecological niche (16, 39). Horizontal gene transfer and gene degradation provide mechanisms for rapid adaptation to changing ecological circumstances or for acquiring optimal fitness so that an organism can survive and flourish under such circumstances (16). The evolutionary advantage of acquiring genomic islands over acquiring smaller genetic elements is that a large number of genes encoding many complementary functions may be transferred en bloc to the recipient organism, a process that may result in “evolution in quantum leaps” (14); one example of this is the acquisition of a type III secretion system which is encoded by LEE (17). On the other hand, a minor environmental change may not require acquisition of all the genetic material present in a genomic island, and the transfer of smaller elements, such as plasmids or transposons, may be more efficient. Considering that we did not explore the LEE in this study to the same extent as OI-122, further analysis should shed more light on the interplay of these genomic islands. OI-122 has three modules, each consisting of genes associated with mobile genetic elements, including transposase genes. One or more of these elements may thus be transposons, a concept supported by the occurrence of one, two, or three OI-122 modules in individual strains. Transposons are typically associated with the transfer of antimicrobial resistance genes under the selective pressure of antibiotics. However, transposons containing genes that encode catabolic functions have also been described, and their presence may be selected by specific substrates (43). Environmental selective factors that could be involved in selecting specific OI-122 modules remain to be investigated. 
Knowledge about the ecological determinants of the presence, absence, or decay of specific OI-122 modules could provide new insights into the origin of pathogenic clones expressing specific modular patterns.

This population genetics study provided new insights about two genomic islands in the evolution of pathogenic VTEC. The results support the hypothesis that genomic islands in VTEC are horizontally acquired and that some of them, like OI-122, are likely acquired in a modular manner. It appears that the less virulent VTEC strains have experienced a loss of genomic island components. Further work can address the question of what role the horizontally acquired islands play in the emergence of new pathogens.

We thank Shelley Frost for technical assistance.

This research was supported by the Public Health Agency of Canada, as well as by a Canadian Institutes of Health Research Food and Water Safety grant and by operating grants from the Canadian Institutes of Health Research and the Howard Hughes Medical Institute.


▿ Published ahead of print on 27 June 2008.

Supplemental material for this article may be found at http://jb.asm.org/.


1. Abe, A., M. de Grado, R. A. Pfuetzner, C. Sanchez-Sanmartin, R. Devinney, J. L. Puente, N. C. Strynadka, and B. B. Finlay. 1999. Enteropathogenic Escherichia coli translocated intimin receptor, Tir, requires a specific chaperone for stable secretion. Mol. Microbiol. 33:1162-1175.
2. Allen, N. L., A. C. Hilton, R. Betts, and C. W. Penn. 2001. Use of representational difference analysis to identify Escherichia coli O157-specific DNA sequences. FEMS Microbiol. Lett. 197:195-201.
3. Brussow, H., C. Canchaya, and W. D. Hardt. 2004. Phages and the evolution of bacterial pathogens: from genomic rearrangements to lysogenic conversion. Microbiol. Mol. Biol. Rev. 68:560-602.
4. Chan, K. N., A. D. Phillips, S. Knutton, H. R. Smith, and J. A. Walker-Smith. 1994. Enteroaggregative Escherichia coli: another cause of acute and chronic diarrhoea in England? J. Pediatr. Gastroenterol. Nutr. 18:87-91.
5. Cirillo, D. M., E. J. Heffernan, L. Wu, J. Harwood, J. Fierer, and D. G. Guiney. 1996. Identification of a domain in Rck, a product of the Salmonella typhimurium virulence plasmid, required for both serum resistance and cell invasion. Infect. Immun. 64:2019-2023.
6. Coombes, B. K., M. E. Wickham, M. Mascarenhas, S. Gruenheid, B. B. Finlay, and M. A. Karmali. 2008. Molecular analysis as an aid to assess the public health risk of non-O157 Shiga toxin-producing Escherichia coli. Appl. Environ. Microbiol. 74:2153-2160.
7. Deibel, C., S. Kramer, T. Chakraborty, and F. Ebel. 1998. EspE, a novel secreted protein of attaching and effacing bacteria, is directly translocated into infected host cells, where it appears as a tyrosine-phosphorylated 90 kDa protein. Mol. Microbiol. 28:463-474.
8. Deng, W., J. L. Puente, S. Gruenheid, Y. Li, B. A. Vallance, A. Vazquez, J. Barba, J. A. Ibarra, P. O'Donnell, P. Metalnikov, K. Ashman, S. Lee, D. Goode, T. Pawson, and B. B. Finlay. 2004. Dissecting virulence: systematic and functional analyses of a pathogenicity island. Proc. Natl. Acad. Sci. USA 101:3597-3602.
9. Dutta, C., and A. Pan. 2002. Horizontal gene transfer and bacterial diversity. J. Biosci. 27:27-33.
10. Elliott, S. J., S. W. Hutcheson, M. S. Dubois, J. L. Mellies, L. A. Wainwright, M. Batchelor, G. Frankel, S. Knutton, and J. B. Kaper. 1999. Identification of CesT, a chaperone for the type III secretion of Tir in enteropathogenic Escherichia coli. Mol. Microbiol. 33:1176-1189.
11. Feng, P., K. A. Lampel, H. Karch, and T. S. Whittam. 1998. Genotypic and phenotypic changes in the emergence of Escherichia coli O157:H7. J. Infect. Dis. 177:1750-1753.
12. Foubister, V., I. Rosenshine, M. S. Donnenberg, and B. B. Finlay. 1994. The eaeB gene of enteropathogenic Escherichia coli is necessary for signal transduction in epithelial cells. Infect. Immun. 62:3038-3040.
13. Goosney, D. L., R. DeVinney, and B. B. Finlay. 2001. Recruitment of cytoskeletal and signaling proteins to enteropathogenic and enterohemorrhagic Escherichia coli pedestals. Infect. Immun. 69:3315-3322.
14. Groisman, E. A., and H. Ochman. 1996. Pathogenicity islands: bacterial evolution in quantum leaps. Cell 87:791-794.
15. Gupta, R. S. 1998. Protein phylogenies and signature sequences: a reappraisal of evolutionary relationships among archaebacteria, eubacteria, and eukaryotes. Microbiol. Mol. Biol. Rev. 62:1435-1491.
16. Hacker, J., and E. Carniel. 2001. Ecological fitness, genomic islands and bacterial pathogenicity. A Darwinian view of the evolution of microbes. EMBO Rep. 2:376-381.
17. Hacker, J., and J. B. Kaper. 2000. Pathogenicity islands and the evolution of microbes. Annu. Rev. Microbiol. 54:641-679.
18. Jarvis, K. G., J. A. Giron, A. E. Jerse, T. K. McDaniel, M. S. Donnenberg, and J. B. Kaper. 1995. Enteropathogenic Escherichia coli contains a putative type III secretion system necessary for the export of proteins involved in attaching and effacing lesion formation. Proc. Natl. Acad. Sci. USA 92:7996-8000.
19. Jarvis, K. G., and J. B. Kaper. 1996. Secretion of extracellular proteins by enterohemorrhagic Escherichia coli via a putative type III secretion system. Infect. Immun. 64:4826-4829.
20. Jerse, A. E., J. Yu, B. D. Tall, and J. B. Kaper. 1990. A genetic locus of enteropathogenic Escherichia coli necessary for the production of attaching and effacing lesions on tissue culture cells. Proc. Natl. Acad. Sci. USA 87:7839-7843.
21. Jin, Q., Z. Yuan, J. Xu, Y. Wang, Y. Shen, W. Lu, J. Wang, H. Liu, J. Yang, F. Yang, X. Zhang, J. Zhang, G. Yang, H. Wu, D. Qu, J. Dong, L. Sun, Y. Xue, A. Zhao, Y. Gao, J. Zhu, B. Kan, K. Ding, S. Chen, H. Cheng, Z. Yao, B. He, R. Chen, D. Ma, B. Qiang, Y. Wen, Y. Hou, and J. Yu. 2002. Genome sequence of Shigella flexneri 2a: insights into pathogenicity through comparison with genomes of Escherichia coli K12 and O157. Nucleic Acids Res. 30:4432-4441.
22. Jores, J., S. Wagner, L. Rumer, J. Eichberg, C. Laturnus, P. Kirsch, P. Schierack, H. Tschape, and L. H. Wieler. 2005. Description of a 111-kb pathogenicity island (PAI) encoding various virulence features in the enterohemorrhagic E. coli (EHEC) strain RW1374 (O103:H2) and detection of a similar PAI in other EHEC strains of serotype 0103:H2. Int. J. Med. Microbiol. 294:417-425.
23. Karmali, M. A. 1989. Infection by verocytotoxin-producing Escherichia coli. Clin. Microbiol. Rev. 2:15-38.
24. Karmali, M. A., M. Mascarenhas, S. Shen, K. Ziebell, S. Johnson, R. Reid-Smith, J. Isaac-Renton, C. Clark, K. Rahn, and J. B. Kaper. 2003. Association of genomic O island 122 of Escherichia coli EDL 933 with verocytotoxin-producing Escherichia coli seropathotypes that are linked to epidemic and/or serious disease. J. Clin. Microbiol. 41:4930-4940.
25. Karmali, M. A., M. Petric, C. Lim, P. C. Fleming, G. S. Arbus, and H. Lior. 1985. The association between idiopathic hemolytic uremic syndrome and infection by verotoxin-producing Escherichia coli. J. Infect. Dis. 151:775-782.
26. Kelly, M., E. Hart, R. Mundy, O. Marches, S. Wiles, L. Badea, S. Luck, M. Tauschek, G. Frankel, R. M. Robins-Browne, and E. L. Hartland. 2006. Essential role of the type III secretion system effector NleB in colonization of mice by Citrobacter rodentium. Infect. Immun. 74:2328-2337.
27. Kenny, B., R. DeVinney, M. Stein, D. J. Reinscheid, E. A. Frey, and B. B. Finlay. 1997. Enteropathogenic E. coli (EPEC) transfers its receptor for intimate adherence into mammalian cells. Cell 91:511-520.
28. Kumar, S., K. Tamura, and M. Nei. 2004. MEGA3: integrated software for molecular evolutionary genetics analysis and sequence alignment. Brief. Bioinform. 5:150-163.
29. Maiden, M. C., J. A. Bygraves, E. Feil, G. Morelli, J. E. Russell, R. Urwin, Q. Zhang, J. Zhou, K. Zurth, D. A. Caugant, I. M. Feavers, M. Achtman, and B. G. Spratt. 1998. Multilocus sequence typing: a portable approach to the identification of clones within populations of pathogenic microorganisms. Proc. Natl. Acad. Sci. USA 95:3140-3145.
30. Miller, S. I. 1991. PhoP/PhoQ: macrophage-specific modulators of Salmonella virulence? Mol. Microbiol. 5:2073-2078.
31. Miller, S. I., W. S. Pulkkinen, M. E. Selsted, and J. J. Mekalanos. 1990. Characterization of defensin resistance phenotypes associated with mutations in the phoP virulence regulon of Salmonella typhimurium. Infect. Immun. 58:3706-3710.
32. Miller, V. L., K. B. Beer, G. Heusipp, B. M. Young, and M. R. Wachtel. 2001. Identification of regions of Ail required for the invasion and serum resistance phenotypes. Mol. Microbiol. 41:1053-1062.
33. Morabito, S., R. Tozzoli, E. Oswald, and A. Caprioli. 2003. A mosaic pathogenicity island made up of the locus of enterocyte effacement and a pathogenicity island of Escherichia coli O157:H7 is frequently present in attaching and effacing E. coli. Infect. Immun. 71:3343-3348.
34. Nataro, J. P., and J. B. Kaper. 1998. Diarrheagenic Escherichia coli. Clin. Microbiol. Rev. 11:142-201.
35. Nei, M., and T. Gojobori. 1986. Simple methods for estimating the numbers of synonymous and nonsynonymous nucleotide substitutions. Mol. Biol. Evol. 3:418-426.
36. Nishio, M., N. Okada, T. Miki, T. Haneda, and H. Danbara. 2005. Identification of the outer-membrane protein PagC required for the serum resistance phenotype in Salmonella enterica serovar Choleraesuis. Microbiology 151:863-873.
37. Ochman, H., and A. C. Wilson. 1987. Evolution in bacteria: evidence for a universal substitution rate in cellular genomes. J. Mol. Evol. 26:74-86.
38. Perna, N. T., G. Plunkett III, V. Burland, B. Mau, J. D. Glasner, D. J. Rose, G. F. Mayhew, P. S. Evans, J. Gregor, H. A. Kirkpatrick, G. Posfai, J. Hackett, S. Klink, A. Boutin, Y. Shao, L. Miller, E. J. Grotbeck, N. W. Davis, A. Lim, E. T. Dimalanta, K. D. Potamousis, J. Apodaca, T. S. Anantharaman, J. Lin, G. Yen, D. C. Schwartz, R. A. Welch, and F. R. Blattner. 2001. Genome sequence of enterohaemorrhagic Escherichia coli O157:H7. Nature 409:529-533.
39. Preston, G. M., B. Haubold, and P. B. Rainey. 1998. Bacterial genomics and adaptation to life on plants: implications for the evolution of pathogenicity and symbiosis. Curr. Opin. Microbiol. 1:589-597.
40. Reid, S. D., C. J. Herbelin, A. C. Bumbaugh, R. K. Selander, and T. S. Whittam. 2000. Parallel evolution of virulence in pathogenic Escherichia coli. Nature 406:64-67.
41. Shen, S., M. Mascarenhas, K. Rahn, J. B. Kaper, and M. A. Karmali. 2004. Evidence for a hybrid genomic island in verocytotoxin-producing Escherichia coli CL3 (serotype O113:H21) containing segments of EDL933 (serotype O157:H7) O islands 122 and 48. Infect. Immun. 72:1496-1503.
42. Stackebrandt, E., W. Frederiksen, G. M. Garrity, P. A. Grimont, P. Kampfer, M. C. Maiden, X. Nesme, R. Rossello-Mora, J. Swings, H. G. Truper, L. Vauterin, A. C. Ward, and W. B. Whitman. 2002. Report of the Ad Hoc Committee for the Re-Evaluation of the Species Definition in Bacteriology. Int. J. Syst. Evol. Microbiol. 52:1043-1047.
43. Tan, H. M. 1999. Bacterial catabolic transposons. Appl. Microbiol. Biotechnol. 51:1-12.
44. Tauschek, M., R. A. Strugnell, and R. M. Robins-Browne. 2002. Characterization and evidence of mobilization of the LEE pathogenicity island of rabbit-specific strains of enteropathogenic Escherichia coli. Mol. Microbiol. 44:1533-1550.
45. Thompson, J. D., T. J. Gibson, F. Plewniak, F. Jeanmougin, and D. G. Higgins. 1997. The CLUSTAL_X Windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res. 25:4876-4882.
46. Wickham, M. E., C. Lupp, M. Mascarenhas, A. Vazquez, B. K. Coombes, N. F. Brown, B. A. Coburn, W. Deng, J. L. Puente, M. A. Karmali, and B. B. Finlay. 2006. Bacterial genetic determinants of non-O157 STEC outbreaks and hemolytic-uremic syndrome after infection. J. Infect. Dis. 194:819-827.
47. Wieler, L. H., T. K. McDaniel, T. S. Whittam, and J. B. Kaper. 1997. Insertion site of the locus of enterocyte effacement in enteropathogenic and enterohemorrhagic Escherichia coli differs in relation to the clonal phylogeny of the strains. FEMS Microbiol. Lett. 156:49-53.
48. World Health Organization. 1999. Zoonotic non-O157 Shiga toxin-producing Escherichia coli (STEC). Report of a WHO scientific working group meeting. World Health Organization, Geneva, Switzerland.
49. Zeigler, D. R. 2003. Gene sequences useful for predicting relatedness of whole genomes in bacteria. Int. J. Syst. Evol. Microbiol. 53:1893-1900.
50. Zhang, W., W. Qi, T. J. Albert, A. S. Motiwala, D. Alland, E. K. Hyytia-Trees, E. M. Ribot, P. I. Fields, T. S. Whittam, and B. Swaminathan. 2006. Probing genomic diversity and evolution of Escherichia coli O157 by single nucleotide polymorphisms. Genome Res. 16:757-767.
51. Ziebell, K., P. Konczy, I. Yong, S. Frost, M. Mascarenhas, A. M. Kropinski, T. S. Whittam, S. C. Read, and M. A. Karmali. 2008. Applicability of phylogenetic methods for characterizing the public health significance of verocytotoxin-producing Escherichia coli strains. Appl. Environ. Microbiol. 74:1671-1675.

Articles from Journal of Bacteriology are provided here courtesy of American Society for Microbiology (ASM)