Copyright © 2009 Abu-Ali et al; licensee BioMed Central Ltd.

Genomic diversity of pathogenic Escherichia coli of the EHEC 2 clonal complex

1 Microbial Evolution Laboratory, National Food Safety & Toxicology Center, 165 Food Safety & Toxicology Building, Michigan State University, East Lansing, Michigan 48824, USA
2 Division of Molecular Biology, Center for Food Safety and Applied Nutrition, U.S. Food and Drug Administration, Laurel, Maryland 20708, USA
3 Biosynth AG, Rietlistrasse 4, 9422 Staad, Switzerland
4 Functional Genomics Center Zurich, University of Zurich and Swiss Federal Institute of Technology Zurich, Winterthurerstrasse 190, 8057 Zurich, Switzerland

Corresponding author.
Galeb S Abu-Ali: abualiga/at/cvm.msu.edu; David W Lacher: David.Lacher/at/fda.hhs.gov; Lukas M Wick: lukas.wick/at/biosynth.ch; Weihong Qi: weihong.qi/at/fgcz.ethz.ch; Thomas S Whittam: abualiga/at/cvm.msu.edu

Received December 18, 2008; Accepted July 3, 2009.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

Background

Evolutionary analyses of enterohemorrhagic Escherichia coli (EHEC) have identified two distantly related clonal groups: EHEC 1, including serotype O157:H7 and its inferred ancestor O55:H7; and EHEC 2, comprised of several serogroups (O26, O111, O118, etc.). These two clonal groups differ in their virulence and global distribution. Although several fully annotated genomic sequences exist for strains of serotype O157:H7, much less is known about the genomic composition of EHEC 2. In this study, we analyzed a set of 24 clinical EHEC 2 strains representing serotypes O26:H11, O111:H8/H11, O118:H16, O153:H11 and O15:H11 from humans and animals by comparative genomic hybridization (CGH) on an oligoarray based on the O157:H7 Sakai genome.

Results

Backbone genes, defined as genes shared by Sakai and K-12, were highly conserved in EHEC 2. The proportion of Sakai phage genes in EHEC 2 was substantially greater than that of Sakai-specific bacterial (non-phage) genes. This proportion was inverted in O55:H7, reiterating that a subset of Sakai bacterial genes is specific to EHEC 1. Split decomposition analysis of gene content revealed that O111:H8 was more genetically uniform and distinct from other EHEC 2 strains, with respect to the Sakai O157:H7 gene distribution. Serotype O26:H11 was the most heterogeneous EHEC 2 subpopulation, comprised of strains with the highest as well as the lowest levels of Sakai gene content conservation. Of the 979 parsimoniously informative genes, 15% were found to be compatible and their distribution in EHEC 2 clustered O111:H8 and O118:H16 strains by serotype. CGH data suggested divergence of the LEE island from the LEE1 to the LEE4 operon, and also between animal and human isolates irrespective of serotype. No correlation was found between gene contents and geographic locations of EHEC 2 strains.

Conclusion

The gene content variation of phage-related genes in EHEC 2 strains supports the hypothesis that extensive modular shuffling of mobile DNA elements has occurred among EHEC strains. These results suggest that EHEC 2 is a multiform pathogenic clonal complex, characterized by substantial intra-serotype genetic variation. The heterogeneous distribution of mobile elements has impacted the diversification of O26:H11 more than other EHEC 2 serotypes.

Background

Enterohemorrhagic Escherichia coli (EHEC), the intersection of Shiga toxin producing E. coli (STEC) and attaching and effacing E. coli (AEEC), comprise a group of pathogenic E. coli that cause a variety of human and animal illnesses ranging from diarrhea to hemorrhagic colitis (HC), and the multifactorial hemolytic uremic syndrome (HUS) [1].
Intimate adherence to the intestinal epithelium resulting in characteristic attaching and effacing (A/E) lesions, and the destruction of capillary walls via production of phage borne Shiga toxins (Stx 1, 2, and variants) are hallmarks of EHEC pathogenesis. A/E lesion formation is dependent upon a type three secretion system (TTSS), which is encoded on the laterally acquired locus of enterocyte effacement (LEE) [2]. E. coli O157:H7 is the dominant EHEC serotype in the United States, Argentina, Great Britain, and Japan [3,4]. However, multiple reports have shown that other EHEC, including serogroups O26, O111, O103, and O118, frequently cause sporadic cases of human illness [5-12], and have been implicated in numerous outbreaks [13-17]. In Australia and parts of Europe, infections with serogroups O26 and O111 are prevailing while the incidence of O157:H7-associated disease appears to be declining [18-21]. In contrast to E. coli O157:H7, EHEC serogroups O26, O111, O118, O103, and O5 are commonly linked to outbreaks and sporadic cases of calf diarrhea (scours) and HC [22-28], which has been validated from experimental infections in calves [29-32]. In Germany and Belgium, for example, EHEC O118 is the most prevalent type of STEC associated with diarrhea in calves [33], with evidence for zoonotic transmission [8,34]. Phylogenetic analyses of conserved metabolic genes have revealed some of the basis for the variation among EHEC strains. Multilocus enzyme electrophoresis [35] and partial sequencing of 13 housekeeping genes [36] classified EHEC into two distantly related clonal groups: EHEC 1 includes serotype O157:H7 and its inferred ancestor O55:H7, whereas EHEC 2 includes numerous serogroups (e.g., O26, O111, O118). The key virulence factors shared between EHEC 1 and EHEC 2 clonal complexes were postulated to have been introduced through multiple and parallel acquisitions of mobile elements [37]. A comparison of E. 
coli O157:H7 genomes has also revealed the extent and significant impact of horizontal transfer on the evolution of virulence [38,39]. Furthermore, array comparative genomic hybridizations (CGH) have shown that the divergence in gene content among closely related O157 strains is ~140 times greater than the divergence at the nucleotide sequence level [40]. Although recent evidence indicates the emergence of highly virulent lineages among non-O157 EHEC, notably the O26 serogroup [19,41], little is known about the gene content, genetic diversity and evolution of virulence in members of the EHEC 2 group. The function of ancillary virulence determinants is somewhat characterized in O157:H7 [2,42]; however, the relevance as well as the distribution of these factors in EHEC 2 is not clear. To systematically investigate the gene content variations within the EHEC 2 clonal group, we analyzed a set of 24 clinical EHEC 2 strains representing serotypes O26:H11, O111:H8/H11, O118:H16, O153:H11 and O15:H11 from humans and animals using array-based CGH. Because there are no EHEC 2 genome sequences available, a multi-genome spotted oligoarray containing probes for 5,978 ORFs from O157:H7 Sakai, O157:H7 EDL933, and K-12 MG1655 was used to examine the distribution of these E. coli genes in our collection of EHEC 2 strains. The findings of this study shed light on the diversification of horizontally acquired elements in a group of pathogens that represent recent evolutionary branches of EHEC clonal groups.

Results

Sequence types (STs) and stx profiles of EHEC 2 strains

Phylogenetic analyses of multilocus sequence typing (MLST) data grouped the 24 EHEC 2 strains (Table 1) into four STs. The most common was ST 106, which was found in 20 strains, while the remaining three STs each differed from ST 106 by a single nucleotide polymorphism (SNP) in almost 4,000 bp of the concatenated MLST sequence.
MLST data revealed a lack of nucleotide sequence diversity in housekeeping genes among these EHEC 2 strains. The neighbor-joining phylogeny based on concatenated MLST allelic sequences grouped the EHEC 2 strains into a distinct cluster, with 100% bootstrap support, which was more closely related to the EPEC 2 group (100% bootstrap support) than to members of EHEC 1 (Figure 1).
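
To give a sense of scale for a single SNP in roughly 4,000 bp, the following is a minimal sketch of a Kimura 2-parameter distance (the distance method named in the Methods) between two aligned sequences. The function name and the toy sequences are hypothetical; this is not the MEGA implementation, and the real concatenated MLST alignments are not reproduced here.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases
        compared += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; everything else is a transversion
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared    # proportion of transitional differences
    q = transversions / compared  # proportion of transversional differences
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))

# Toy example (hypothetical sequences): one transition in 4,000 aligned sites
reference = "ACGT" * 1000
variant = "GCGT" + "ACGT" * 999
d_single_snp = k2p_distance(reference, variant)
print(f"{d_single_snp:.6f}")
```

With one difference in 4,000 sites the distance is about 0.00025 substitutions per site, which illustrates why the four STs collapse into one tight cluster in a tree built from such distances.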

Gene content of EHEC 2 strains

Binary classification of genes as present or divergent/absent, inferred by GACK analyses of the CGH data, was used to determine the gene content of all 24 EHEC 2 strains (Table 2) and of each individual strain (Table 3). Because all CGH experiments were performed with Sakai as the reference strain, our analyses focused on probes targeting genes present in the Sakai genome. The oligo probes were classified as representing backbone genes (shared by Sakai and K-12) or Sakai-specific genes (note that the term "Sakai-specific" is used here only in comparison to K-12). The Sakai-specific genes were further classified into Sakai phage genes (phage-related genes present in Sakai but absent in K-12) and Sakai bacterial genes (non-phage-related genes present in Sakai but absent in K-12) [38]. Of the 3,696 backbone genes, 80.9% were shared by all EHEC 2 strains, whereas only 5.8% of the Sakai phage genes (n = 814) and 6.5% of the Sakai bacterial genes (n = 434) were found in every tested EHEC 2 strain. While 84.7% of the Sakai phage genes were found in at least one of the 24 EHEC 2 strains, fully 53% of the Sakai bacterial genes were not found in any of these strains (Table 2).
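
The three summary statistics used above (genes found in every strain, in at least one strain, and in no strain) can be sketched from a binary present / divergent-absent matrix as follows. This is an illustration only: the function name and the toy calls are hypothetical and are not the study's data.

```python
def summarize_gene_content(calls):
    """Summarize binary CGH calls.

    calls maps a gene name to a list of 0/1 values, one per strain
    (1 = present, 0 = divergent/absent). Returns the fraction of genes
    found in all strains, in at least one strain, and in no strain.
    """
    n_genes = len(calls)
    if n_genes == 0:
        raise ValueError("no genes supplied")
    in_all = sum(1 for strain_calls in calls.values() if all(strain_calls))
    in_any = sum(1 for strain_calls in calls.values() if any(strain_calls))
    return {
        "in_all": in_all / n_genes,
        "in_at_least_one": in_any / n_genes,
        "in_none": (n_genes - in_any) / n_genes,
    }

# Toy matrix: four hypothetical genes scored in three hypothetical strains
toy_calls = {
    "geneA": [1, 1, 1],  # present in every strain
    "geneB": [1, 0, 1],  # variable
    "geneC": [0, 0, 0],  # not detected in any strain
    "geneD": [0, 1, 0],  # variable
}
summary = summarize_gene_content(toy_calls)
print(summary)
```

Applied separately to the backbone, Sakai phage, and Sakai bacterial gene sets, a summary of this kind would yield the per-category percentages of the sort reported in Table 2.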
In each individual EHEC 2 strain, approximately 95% of the 3,696 backbone genes were found (Table 3, Figure 2).

Identification of potential EHEC-specific genes

Of the 1,248 Sakai-specific genes represented on the microarray, 152 (12.2%) were conserved in 23 of the 24 EHEC 2 strains; 102 of these were phage-related. Sixty-four genes encode hypothetical proteins of unknown function, and the remainder consisted mostly of genes responsible for various prophage and other mobile element functions. Nucleotide sequences of these 152 genes were compared against five non-EHEC pathogenic E. coli (536, APEC O1, B171, CFT073, UTI89) and six Shigella (Sf2a 2457T, Sf2a 301, Sf5 8401, Ss046, Sb227, Sd197) published genomes, using BLAST. With a minimum of 80% nucleotide sequence identity over a minimum of 80% query coverage as the cutoff to identify conserved genes, 26 of the 152 genes were not found in any of the 11 queried non-EHEC genome sequences. The 26 gene sequences were then "BLASTed" against the entire GenBank database with the same cutoff value. Only three of these 26 genes were not found in any other organisms and therefore could be considered as specific to EHEC strains: ECs1561 (Sakai prophage (Sp) 6); ECs1763, and ECs1822 (Sp 9). All three genes encode hypothetical proteins of unknown function.

Genomic relatedness of EHEC 2 strains

We used the split decomposition method to infer the strain relatedness based on gene content data. We first analyzed all the 4,800 genes whose probe intensities were higher than those for negative controls. As expected, the analysis showed a network-like phylogeny (Figure 3).
Among the 4,800 genes whose probe intensities were higher than those for negative controls, 70.8% were found to be either present or divergent/absent in all 24 strains, and therefore, phylogenetically uninformative. Compatibility analysis of the 979 parsimoniously informative (PI) genes identified 147 PI genes to be phylogenetically compatible with each other, but not compatible with the rest of the PI genes (the distribution of these genes is shown in Additional file 1). For the second split decomposition analysis, these 147 genes were combined with 421 singleton genes (genes found present or divergent/absent in only one of the 24 EHEC 2 strains). Singletons were added to generate terminal edges of the network and to help distinguish strain-specific changes. The analysis with this set of genes showed a more tree-like phylogeny with a better separation of EHEC 2 strains (Figure 4).
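
The distinction between constant, singleton, and parsimoniously informative genes, and the notion of two genes being compatible, can be illustrated with a short sketch. It assumes binary presence/absence characters and uses the standard pairwise test for binary characters (two characters are compatible unless all four state combinations occur across strains). The function names and toy patterns are hypothetical, and the sketch does not reproduce the clique search performed by PHYLIP.

```python
from itertools import combinations

def classify_pattern(pattern):
    """Classify a 0/1 presence pattern across strains."""
    ones = sum(pattern)
    zeros = len(pattern) - ones
    if ones == 0 or zeros == 0:
        return "constant"      # same call in every strain: uninformative
    if ones == 1 or zeros == 1:
        return "singleton"     # exactly one strain differs from the rest
    return "informative"       # each state occurs in at least two strains

def compatible(pattern1, pattern2):
    """Pairwise compatibility of two binary characters.

    The pair is incompatible only if all four combinations
    (0,0), (0,1), (1,0), (1,1) are observed across the strains.
    """
    return len(set(zip(pattern1, pattern2))) < 4

# Toy patterns for five hypothetical strains
toy_genes = {
    "g1": [1, 1, 0, 0, 0],
    "g2": [1, 1, 1, 0, 0],
    "g3": [1, 0, 1, 0, 0],
    "g4": [1, 1, 1, 1, 1],
    "g5": [0, 0, 0, 0, 1],
}
classes = {name: classify_pattern(p) for name, p in toy_genes.items()}
informative = [name for name, c in classes.items() if c == "informative"]
compatible_pairs = [
    (a, b) for a, b in combinations(informative, 2)
    if compatible(toy_genes[a], toy_genes[b])
]
print(classes)
print(compatible_pairs)
```

As a consistency check on the reported figures: if the 979 PI genes and the 421 singletons together account for all variable genes, then (4,800 − 979 − 421) / 4,800 ≈ 70.8% of genes are constant, and 147 / 979 ≈ 15% of the PI genes fall in the compatible set, matching the percentages quoted in the text.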

Prophages

To visualize gene content of the 814 Sakai phage genes within the EHEC 2 clonal group, we classified these genes by Sakai phage groups (Sakai prophages Sp1–18, and prophage-like elements SpLE1–6) and sorted the genes in each group by chromosomal order (based on ECs numbers). This classification does not necessarily imply that these genes are present in EHEC 2 within the same phage or order as they are in Sakai, but simply allows an assessment of gene content variation of laterally acquired genes known to be linked in the Sakai chromosome. Dendrograms based on pairwise comparison of gene content were used to identify EHEC 2 strains with similar gene content (Figure 5).
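
A pairwise comparison of gene content can be expressed as the proportion of genes whose binary call differs between two strains (an uncorrected p distance, the same measure the Methods name for the neighbor-net analysis). The sketch below is a minimal illustration with hypothetical strain names and calls; the clustering method used to draw the dendrograms is not specified in the text, so only the distance step is shown.

```python
def p_distance(profile_a, profile_b):
    """Proportion of genes whose 0/1 call differs between two strains."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same genes")
    differing = sum(x != y for x, y in zip(profile_a, profile_b))
    return differing / len(profile_a)

def distance_matrix(profiles):
    """All pairwise p distances, keyed by (strain, strain)."""
    names = sorted(profiles)
    return {(s, t): p_distance(profiles[s], profiles[t])
            for s in names for t in names}

# Toy profiles: three hypothetical strains scored for four phage genes
toy_profiles = {
    "strain1": [1, 1, 0, 1],
    "strain2": [1, 0, 0, 1],
    "strain3": [0, 0, 1, 0],
}
distances = distance_matrix(toy_profiles)
print(distances[("strain1", "strain2")])
```

Strains with small pairwise distances within a phage group would sit close together in a dendrogram built from such a matrix.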

Stx converting prophages

The CGH data confirmed the stx1/stx2 profile of the EHEC 2 strains determined by PCR. In Sp15 (stx1-prophage), a block of genes at the beginning of the phage (ECs2940–2952) was conserved in most strains (Figure 5). Strains positive for the stx2 gene, mostly representing serotype O111:H8, had more Sp5 (stx2-phage) genes. Integrase and excisionase genes (ECs1160 and ECs1161), and the block of genes at the beginning of the phage, ECs1160–1187, were missing from most strains. The rest of Sp5 genes, which encode replication proteins O and P, NinE and NinG, Shiga toxin 2, antirepressor proteins, antitermination protein Q, outer membrane precursor proteins, terminases, tail proteins, and a number of hypothetical proteins, were present in five of the six O111:H8 strains as well as in the O26:H11 strain containing both stx1 and stx2 (Figure 5).

Locus of enterocyte effacement (LEE) island

Of the 41 genes in the Sakai LEE island that are located on SpLE4, all except escU were present in the O55:H7 strain. This includes genes that were categorized as present after the initial GACK cutoff was relaxed by 20%. Since dye-swap genomic microarrays represent competitive hybridizations between two populations of DNA, there were instances when a small difference in the nucleotide sequence of the tested strain resulted in weaker probe signal intensity. For example, both of the two known SNPs present between the variable regions of γ intimin in O55:H7 and O157:H7 [45] are located in the middle region of the 70-mer probe for eae. Hence the signal intensity for this gene was just below the cutoff (gray shading in Figure 5).

Other phage gene groups

Most genes from SpLE1, which encodes the tellurite resistance and adherence island (TAI), were divergent/absent from two EHEC 2 strains and from the O55:H7 representative, but present in the rest of the EHEC 2 strains (Figure 5).

Non-LEE encoded effectors

The gene content of non-LEE encoded effectors, which are predicted to be secreted by the LEE-encoded TTSS [42] in EHEC 2, varied from totally divergent/absent to present in every strain (Additional file 3). Genes espY1, nleD, espX2, espY4, espL3', espX3', espL4, and nleB2-1 were divergent/absent from EHEC 2, whereas a set of 15 genes (espX1, espX5, espX6, espY3, espK, nleA, nleE, nleG, nleG2-2, nleG6-1, espM1, espM2, espR1, espL1, and espW) were present in at least 22 EHEC 2 strains. The nleG7 gene, which was recently found to be conserved in a group of non-O157 EHEC strains [46], was also divergent/absent in all EHEC 2 examined in this study.

Discussion

Comparative analysis of genomes from 17 commensal and pathogenic E. coli strains has revealed a diverse species 'pan-genome', while the E. coli 'core conserved' genome was calculated to be about one-half of the genome of a given E. coli isolate [47]. Although EHEC utilize similar virulence mechanisms, this pathotype is comprised of phylogenetically distinct lineages that vary in their ability to cause disease in both humans and animals. Clearly, the genome of a single strain cannot reflect how the genomic diversity among EHEC strains influences pathogenesis of the EHEC population. Because no strains from the EHEC 2 clonal group have been sequenced, the genetic variability of 24 EHEC 2 strains was examined in relation to the distribution of genes from O157:H7 Sakai, which belongs to the EHEC 1 clonal group. The Sakai genome was used in this study, as its annotation is suggested to include more strain-specific genes compared to EDL933 [47]. Genes specific to the EHEC 2 group have yet to be described.
Some genes shared with Sakai might have been missed in our study, if the gene sequence had diverged to a point where the 70-mer oligonucleotide probes and the stringency of competitive hybridization preclude detection. Although this study allowed screening of known genes only, the gene content data still offered new insight on strain relatedness and the distribution and subsequent diversification of mobile elements within the EHEC 2 clonal group. The CGH data presented here indicate that there are two distinct trends, which reflect the bacterial (vertical) and phage (lateral) origin of genes, impacting the genomic divergence of EHEC 2. Virtually the entire set of backbone genes was present within the EHEC 2 clonal group (Tables 2 and 3). CGH inferences pertaining to the distribution of backbone genes can vary depending on array type, sample size, and strain diversity [46]. For example, Anjum et al. have proposed that the O26 serogroup exhibits greater genetic homogeneity than was observed in our study [48]; however, the microarray platform used in that study was limited to the genome of K-12 MG1655. Despite these differences, the degree of conservation among backbone genes in this CGH investigation was similar to that reported in previous studies [46,49,50]. The distribution of Sakai-specific genes in EHEC 2 was, not surprisingly, noticeably lower than that of the backbone, which restates established findings about intraspecies genomic variability [40,51,52]. The conservation of Sakai phage genes was, however, found to be more than 2-fold higher when compared to Sakai bacterial genes (Figure 2). The increased presence of Sakai phage genes in the EHEC 2 group compared to Sakai bacterial genes reveals independent acquisition and exchange of similar mobile elements. For example, of the 152 Sakai-specific genes present in EHEC 2, only 26 genes were not found in 11 completed non-EHEC E. coli and Shigella spp. genomes.
About one-half of the 26 "EHEC only" genes were found in stx1-encoding phages BP-4795 and CP-1639 from STEC O84:H11 and O111:H-, respectively [54,55]. Sakai genes identified by BLASTN as present on BP-4795 are disseminated on phages Sp6, 9, 10, and 12, which is in agreement with the evidence for recombination between phages [56]. Although the number of phage genes shared by all tested strains was low, the percentage of those that were VAP was high (Table 2), which may reflect sequence heterogeneity in prophage genomes with similar modular structures [54,56,57], and not true absence of genes. Phylogenetic network analysis implied a serotype-specific uniformity of O111:H8 strains, unlike other EHEC 2 strains (Figure 3). Based on the distinguishing distribution of Sakai genes (Figures 3 [...]) [...] A proportion of the EHEC 2 hybridization data (15% of the PI genes) were identified as genes that are phylogenetically compatible with each other, i.e., having no homoplasy. Although this represents a small number of genes, it is remarkable that the distribution pattern grouped EHEC 2 O111:H8 and O118:H16 strains by serotype (Figure 4). The heterogeneity of Stx phages has been demonstrated [57,63], even within the O157:H7 lineage itself [64,65], so it is not unexpected to find such variation between different EHEC 2 strains. In addition, Ogura et al. propose that Stx phages have alternative integration sites in EHEC 2 [46]; this may explain our lack of detection of integrase genes, as integration site specificity is dependent on the alignment of the phage integrase with the attachment sequence in the bacterial chromosome [66]. Strains that were stx negative in our study were, nevertheless, found to carry genes from the Sp15 and Sp5 phages, which is a common effect of frequent modular shuffling of sequences between phages of related enteric hosts [56,67,68]. The significance of the unique conservation patterns of Sp10 and Sp18 phage genes is not clear.
Sp10 is perhaps more conserved as it harbors non-LEE effector genes [42], all 3 of which were detected in at least 22 out of 24 EHEC 2 strains. Absence of the entire Sp18 was also detected among O157:H7 strains [65], one of which belongs to a hyper-virulent lineage of the O157:H7 population [69]. Incongruent divergence of LEE operons has been previously suggested. Studies indicate that this island is a dynamic region [70], and that different selective pressures act on different parts of the LEE [71]. The sequence diversity of the LEE, both at the nucleotide and amino acid level, increases along the length of the island from the LEE1 to the LEE4 operon [71,72]. A comparable trend can be observed in the CGH data presented here, as there was greater conservation of the content of genes that encode the secretion apparatus (LEE1–3). However, differences in the content of O157:H7 Sakai LEE genes between human and animal EHEC 2 strains of the same serotype (Figure 5) [...] Muniesa et al. suggest that the LEE genes associated with serogroup O26 are present more commonly in STEC than the LEE genes associated with EHEC O157:H7 or EPEC O127:H6 [76]. Yet, there is no clear evidence to support the hypothesis that LEE divergence within a lineage results from positive adaptive pressure in different host species. In fact, when several LEE genes from strain RDEC-1 were compared to those from other AEEC, the variation appeared to be associated with evolutionary lineage and not host specificity [77]. Even so, given the heterogeneous diversification of this island and the recent inference about host-specific expression of espA and eae in O157:H7 [78], it would be interesting to compare complete LEE sequences from a larger sample of EHEC 2 strains of human and animal origin.

Conclusion

Here, we present an assessment of the gene content of a set of EHEC 2 clinical strains of animal and human origin, isolated from the USA and Europe.
The small subset of phylogenetically compatible genes represents potential markers that will aid in the investigation of the relatedness and cladogenesis of the EHEC 2 clonal group. In this study, serotype O26:H11, the most frequent EHEC 2 serotype associated with overt disease, represented the most diverse EHEC 2 population. Compared to the more homogeneous O111:H8 strains, O26:H11 strains may have an increased propensity to laterally exchange DNA, which may ultimately give rise to hyper-virulent lineages within EHEC 2 O26:H11. Furthermore, the identification of several EHEC-specific genes could potentially be used as novel genetic markers to identify strains belonging to this pathotype.

Methods

Bacterial strains and DNA isolation

Since genome sequences for tested strains are not available, two-color hybridizations between sequenced strains of E. coli O157:H7 RIMD 0509952 (Sakai) [38] and K-12 MG1655 [79] were used as references. A total of 24 EHEC 2 strains including serotypes O26:H11 (n = 8), O111:H8 (n = 6), O111:H11 (n = 2), O118:H16 (n = 6), O153:H- (n = 1), and O15:H11 (n = 1), originally isolated from human and animal cases of STEC-associated disease, were used in this study and were selected based on the serotype and source (Table 1) [6,33,80-90]. The study also included an EHEC 1 O55:H7 strain, isolated from a human diarrhea case. Bacterial DNA was prepared from overnight LB cultures grown at 37°C using the Puregene genomic DNA isolation kit (Gentra Systems, Minneapolis, MN).

Multilocus sequence typing (MLST) and Shiga toxin (Stx) genes

The detailed MLST protocol and multiplex PCR conditions for characterizing the Stx genes (stx1/stx2) can be found at the STEC Reference Center website http://www.shigatox.net. Briefly, MLST was performed on seven conserved housekeeping genes (aspC, clpX, fadD, icdA, lysP, mdh, and uidA), and sequence type (ST) assignments were made based on phylogenetic analyses of the concatenated sequences.

Oligonucleotide arrays

The Qiagen (Valencia, Calif.) spotted multi-genome arrays containing probes specific for 5,978 ORFs from E. coli K-12 MG1655, O157:H7 Sakai and EDL933 were utilized. Of these probes, a total of 5,943 were 70-mer oligonucleotides and 35 ranged from 41–69 bp. The probes were printed in duplicate on UltraGaps glass slides (Corning Inc., NY) at the Research Technology Support Facility at Michigan State University. The array also contained 384 spots representing 12 randomized negative control 70-mer probes. All probes were assigned ORF designations (b- = MG1655, ECs- = Sakai, or Z- = EDL933 numbers) or intergenic region labels based on the RefSeq database available on the National Center for Biotechnology Information (NCBI) website [91].

In silico analysis of microarray probe specificity

To verify the probes with the up-to-date genome annotations, we compared all 5,990 probe sequences against the three E. coli genomes (MG1655, Sakai, and EDL933) by BLASTN available on NCBI, and recorded the two highest hits for every probe (top hit and second hit) for each genome. A probe was considered to be specific for a target when the top hit demonstrated ≥ 80% identity to the probe sequence stretch in the strain. Probes with nonspecific hybridization and multiple target hybridizations within MG1655 or Sakai DNA were excluded from the data analysis of MG1655 and Sakai hybridizations. These included probes that had multiple top hits with 75% overall identity or probes that had multiple top hits between 50% and 75% of overall identity with alignments containing a stretch of nucleotides with 100% identity, in which the stretch was 20% of the probe length. With respect to the MG1655 and Sakai genomes, out of 5,978 probes, 12 had no target (EDL933 specific), 731 showed nonspecific hybridization or had multiple targets, and 5,235 matched single genome targets. Of these, 3,803 targeted both genomes, with 1,002 targeting only Sakai and 430 targeting only K-12.
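
The probe accounting above can be verified with a few lines of arithmetic. The sketch below only restates the counts reported in the text; the dictionary keys and the helper name are hypothetical labels chosen for readability.

```python
# Probe counts as reported in the text (MG1655 and Sakai genomes only)
TOTAL_PROBES = 5978

by_specificity = {
    "no_target_edl933_specific": 12,
    "nonspecific_or_multiple_targets": 731,
    "single_genome_target": 5235,
}

by_target = {
    "both_genomes": 3803,
    "sakai_only": 1002,
    "k12_only": 430,
}

def sums_to(parts, total):
    """True if the category counts add up to the stated total."""
    return sum(parts.values()) == total

print(sums_to(by_specificity, TOTAL_PROBES))
print(sums_to(by_target, by_specificity["single_genome_target"]))

# The 5,990 sequences submitted to BLASTN equal the 5,978 ORF probes
# plus the 12 randomized negative control probes.
print(TOTAL_PROBES + 12)
```

Both partitions are internally consistent, and the difference between the 5,978 ORF probes and the 5,990 BLASTN queries matches the 12 negative control probes described for the array.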

DNA labeling and microarray hybridization

Genomic DNA was sheared into 500 to 5,000 bp fragments in a cup sonicator (Heat Systems Ultrasonics W-225, 20 kHz, 200 W) and 250 ng of sheared DNA was labeled with aminoallyl-dUTP (Sigma, St. Louis, Mo.) using the Invitrogen (Carlsbad, Calif.) DNA labeling system, as previously described [40]. Equal amounts of DNA from Sakai and test strains were suspended and combined in a final volume of 44 μL of SlydeHyb Buffer #1 (Ambion, Inc., Austin, TX). Qiagen E. coli spotted oligoarrays were hybridized and washed according to the manufacturer's instructions for hybridization using coverslips. Test strains were hybridized twice with Sakai as a reference: once with the Cy5 labeled test strain and Cy3 labeled Sakai and once with the Cy3 labeled test strain and Cy5 labeled Sakai to correct for dye incorporation bias.

Data collection and analyses

Arrays were scanned with the Genepix 4000B array scanner (Axon Instruments, Union City, Calif.) and probe intensities (median pixel intensities) were retrieved using Genepix 6.0 (Axon Instruments). Data quality was assessed by viewing plots of M versus A [M = log2 (test/reference); A = log2 (test × reference)] and by checking for spatial effects with Genepix 6.0 and GeneTraffic (Iobion, La Jolla, Calif.) as described previously [40]. Because genome sequences of tested strains were not available, microarray data were not normalized, to avoid biasing the gene content of tested strains. Instead, microarray images showing spatial bias were discarded and hybridizations were repeated until control parameters were appropriate. Duplicate probes for each gene were averaged prior to analyses. Probes with median pixel intensities higher than the median of the randomized negative controls were analyzed as the distribution of the two-color signal ratios using the "GACK" program [92].
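
The M and A quantities defined above, and one way a dye-swap pair could be combined, can be sketched as follows. The formulas for M and A follow the definitions given in the text; averaging the two dye orientations is an assumption made for illustration, since the text does not state how the two hybridizations were combined. Function names and intensity values are hypothetical.

```python
import math

def ma_values(test, reference):
    """Return (M, A) for one probe from test and reference intensities.

    M = log2(test / reference); A = log2(test * reference), as defined
    in the text. Many MA-plot conventions use half of this A value.
    """
    m = math.log2(test / reference)
    a = math.log2(test * reference)
    return m, a

def dye_swap_log_ratio(test_cy5, ref_cy3, test_cy3, ref_cy5):
    """Average log2(test/reference) over the two dye orientations.

    Assumption for illustration: dye bias is reduced by averaging the
    forward and swapped log ratios.
    """
    m_forward = math.log2(test_cy5 / ref_cy3)
    m_swapped = math.log2(test_cy3 / ref_cy5)
    return (m_forward + m_swapped) / 2

# Toy intensities (hypothetical), chosen as powers of two for clarity
m_value, a_value = ma_values(test=1024, reference=256)
combined = dye_swap_log_ratio(test_cy5=1024, ref_cy3=256,
                              test_cy3=512, ref_cy5=512)
print(m_value, a_value, combined)
```

A probe whose target is divergent or absent in the test strain gives a strongly negative M in both orientations, whereas a dye-specific artifact tends to change sign between orientations and is damped by the average.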
Analyses of the log2 (test strain/reference strain) distribution (GACK1) as well as of the reciprocal ratio, log2 (reference strain/test strain) (GACK2), were performed for Sakai versus MG1655 hybridizations to determine a cutoff. Genes with a GACK1 value of ≥ 0.1 were classified as present, whereas genes with a GACK1 value of < 0.1 were classified as divergent/absent. At this cutoff, maximum sensitivity (98.8%) and specificity (96%) were achieved for the MG1655/Sakai dye-swap hybridizations, and therefore, this cutoff was used to interpret the data from Sakai versus EHEC 2 hybridizations. The term 'present' is used to indicate that a gene was detected by CGH, and does not necessarily imply that the whole gene is conserved or functional; likewise, the term 'divergent/absent' indicates that a gene was not detected by CGH.

Phylogenetic analyses

Strains were assigned to clonal groups based on STs and bootstrap analyses as described previously [36,93]. A neighbor-joining tree of the concatenated MLST sequences was constructed using the Kimura 2-parameter distance method with 1000 bootstrap replications in MEGA 3.1 [94]. The tree includes other enteropathogenic E. coli (EPEC) and EHEC STs as well as the lab-derived K-12 (ST173) and the uropathogenic E. coli CFT073 (ST27) for comparison; an E. albertii strain was used as the outgroup. For phylogenetic analyses of the microarray data, a total of 144 genes (from all array hybridizations) with probe intensities below those of negative controls were excluded from the set of 4,944 genes. Neighbor-net phylogenies highlighting the distribution of Sakai genes in EHEC 2 strains, for which the presence or absence of genes was coded as 0 (divergent/absent) or 1 (present), were constructed using the uncorrected p distance in Splitstree 4.3 [95].
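
The binary call at the 0.1 cutoff, and the way sensitivity and specificity can be estimated when the true gene content is known from genome sequences (as in the MG1655 versus Sakai control hybridizations), can be sketched as below. This does not reproduce the GACK program itself; the per-gene scores, gene names, and function names are hypothetical.

```python
CUTOFF = 0.1  # GACK1 cutoff reported in the text

def call_gene(score, cutoff=CUTOFF):
    """Return 1 (present) if score >= cutoff, else 0 (divergent/absent)."""
    return 1 if score >= cutoff else 0

def sensitivity_specificity(scores, truth, cutoff=CUTOFF):
    """Compare CGH calls with known gene content.

    scores: gene -> score from the array analysis (hypothetical values)
    truth:  gene -> 1 if the gene is known to be present in the tested
            genome, 0 otherwise (known from the sequenced control strain)
    """
    tp = fn = tn = fp = 0
    for gene, score in scores.items():
        called = call_gene(score, cutoff)
        if truth[gene]:
            tp += called
            fn += 1 - called
        else:
            fp += called
            tn += 1 - called
    sensitivity = tp / (tp + fn)   # known-present genes called present
    specificity = tn / (tn + fp)   # known-absent genes called absent
    return sensitivity, specificity

# Toy control data (hypothetical): five genes with known status
toy_scores = {"g1": 0.5, "g2": 0.1, "g3": 0.05, "g4": -0.4, "g5": 0.3}
toy_truth = {"g1": 1, "g2": 1, "g3": 1, "g4": 0, "g5": 0}
sens, spec = sensitivity_specificity(toy_scores, toy_truth)
print(sens, spec)
```

Scanning candidate cutoffs with a function of this kind and keeping the value that maximizes both measures is one plausible way to arrive at a threshold such as the 0.1 used here.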
The number of Sakai genes whose distribution in EHEC 2 was parsimoniously informative was determined in MEGA 3.1 [94], and the set of Sakai genes in EHEC 2 whose distribution was compatible with a single phylogeny was identified using the clique module of PHYLIP [96].

Competing interests

The authors declare that they have no competing interests.

Authors' contributions

GSA designed the study, collected and analyzed CGH data, and drafted the manuscript. DWL performed intimin and stx typing, phylogenetic analyses and helped to draft the manuscript. LMW participated in design and analysis of CGH data and WQ performed in silico verification of microarray probe specificity. TSW participated in the design and coordination of the study, conducted the phylogenetic analyses, and helped draft the manuscript. The first four authors read and approved the final manuscript; TSW had approved an earlier draft of the manuscript (deceased, December 5, 2008).

Additional file 1

Distribution of phylogenetically compatible genes in EHEC 2, determined with the clique program in the PHYLIP package. Conserved genes have a value of 1 and divergent/absent have a value of 0.

Additional file 2

Genes in EHEC 2 whose distribution was used to generate colormaps in Figure 5.

Additional file 3

Distribution of 49 non-LEE effector genes in EHEC 2. Conserved genes have a value of 1, divergent/absent genes have a value of 0 and genes that have a value of 0.5 were inferred as conserved after the GACK cutoff was relaxed by 20%.

Additional file 4

Distribution of K-12-specific genes in EHEC 2. Conserved genes have a value of 1 and divergent/absent genes have a value of 0.

Acknowledgements

The authors thank Shannon Manning, James Riordan, Sivapriya Kailasan Vanaja, Linda Mansfield, Martha Mulks, and Jillian Tietjen for critically reviewing earlier versions of the manuscript; Lindsey Ouellette for technical assistance with MLST; and those investigators who supplied strains for use in the study. This project was funded in part by the MSU foundation and the NIAID, NIH, DHHS, under NIH research contract N01-AI-30058 (TSW), which supports the STEC Center. The authors wish to dedicate this work to the memory of Dr. Thomas S. Whittam.