Proc Natl Acad Sci U S A. Jan 7, 2003; 100(1): 247–252.
Published online Dec 27, 2002. doi:  10.1073/pnas.232686799
PMCID: PMC140941

Identification of pathogen-specific and conserved genes expressed in vivo by an avian pathogenic Escherichia coli strain


Escherichia coli is a diverse bacterial species that comprises commensal nonpathogenic strains such as E. coli K-12 and pathogenic strains that cause a variety of diseases in different host species. Avian pathogenic E. coli strain χ7122 (O78:K80:H9) was used in a chicken infection model to identify bacterial genes that are expressed in infected tissues. By using the cDNA selection method of selective capture of transcribed sequences and enrichment for the isolation of pathogen-specific (non-E. coli K-12) transcripts, pathogen-specific cDNAs were identified. Pathogen-specific transcripts corresponded to putative adhesins, lipopolysaccharide core synthesis, iron-responsive, plasmid- and phage-encoded genes, and genes of unknown function. Specific deletion of the aerobactin siderophore system and E. coli iro locus, which were identified by selective capture of transcribed sequences, demonstrated that these pathogen-specific systems contribute to the virulence of strain χ7122. Consecutive blocking to enrich for selection of pathogen-specific genes did not completely eliminate the presence of transcripts that corresponded to sequences also present in E. coli K-12. These E. coli conserved genes are likely to be highly expressed in vivo and contribute to growth or virulence. Overall, the approach we have used simultaneously provided a means to identify novel pathogen-specific genes expressed in vivo and insight regarding the global gene expression and physiology of a pathogenic E. coli strain in a natural animal host during the infectious process.

Escherichia coli is an adaptive bacterial species that is both a commensal resident of the intestine and a versatile pathogen of humans and other animals. Diseases associated with pathogenic E. coli include enteric infections caused by diarrheagenic E. coli and extraintestinal infections, such as urinary tract infections, meningitis, and septicemia caused by extraintestinal pathogenic E. coli (ExPEC) (1, 2). Specific E. coli pathotypes cause particular pathologies in different animal species. This host adaptation and virulence capacity are attributed to the horizontal acquisition of specific genes that are generally absent from the genomes of nonpathogenic E. coli strains, such as the laboratory strain E. coli K-12. Pathogen-specific genes are located on plasmids, bacteriophages, or discrete regions of DNA that have been termed pathogenicity islands (3, 4). Regions absent from E. coli K-12 that have not been confirmed to contribute to virulence have been termed unique sequence islands (3). Comparison of the complete genomes of E. coli K-12 strain MG1655 and enterohemorrhagic O157:H7 strain EDL933 demonstrates that both strains have genomes containing distinct genetic regions (4–6). The capacity of bacteria such as E. coli to acquire and delete DNA regions has permitted adaptation to new niches and the evolution of diverse virulence mechanisms (4, 7). The genomes of pathogenic E. coli are typically larger than those of K-12 strains (8), and pathogen-specific regions have been shown to contribute directly to virulence (9–11).

Although the presence of pathogen-specific genes may dictate the pathogenic lifestyle and virulence of particular bacteria, products encoded by conserved or “core” genes undoubtedly contribute to metabolism, physiology, and adaptation to environmental changes. In E. coli, ≈75–90% of the genome is conserved, as demonstrated by comparison of E. coli strains MG1655 (K-12) and EDL933 (O157:H7) (6). Some conserved gene products are essential for cell physiology or survival under conditions of stress that may include adaptation to host environments or resistance to host defenses. In other cases, conserved genes may encode regulators that control the expression of pathogen-specific virulence-associated genes. Examples of such regulators include RpoS (12), Dam methylase in Salmonella and pathogenic E. coli (13, 14), and LuxS in enterohemorrhagic E. coli (15).

Avian pathogenic E. coli (APEC) cause extraintestinal infections, including respiratory infection (airsacculitis and pneumonia), pericarditis, perihepatitis, and septicemia of poultry (16). Predominant serotypes of E. coli associated with these infections are O1:K1, O2:K1, and O78:K80. APEC most likely enter and colonize the air sacs through inhalation of feces-contaminated dust (17). In certain cases, the bacteria spread systemically and cause fatal septicemia. Certain factors have been associated with the virulence of APEC, including the aerobactin iron-sequestering system, temperature-sensitive hemagglutinin (Tsh), Type 1 and P fimbriae, and Colicin-V plasmids (16, 18). Brown and Curtiss (11) used subtractive DNA hybridization to identify regions in the genome of APEC strain χ7122 (O78:H9:K80) that are absent from E. coli K-12. Twelve unique sequence islands (USIs) were identified in strain χ7122. By using P1 bacteriophage-mediated transduction, four of these USIs were replaced with the corresponding region of E. coli K-12. Two of these USIs, the 45.0-min region encoding the O78 antigen and an uncharacterized region located at 0 min, contributed to the virulence of APEC χ7122.

Direct screening of bacterial genes expressed during infection of the host is limited, because isolation of bacterial transcripts from host tissues necessitates separation from the abundance of host RNA. Recently, selective capture of transcribed sequences (SCOTS) has been used to identify bacterial genes expressed within macrophages (19–22). SCOTS allows the selective capture of bacterial cDNA derived from infected cells or tissues using hybridization to biotinylated bacterial genomic DNA. In this report, we have used SCOTS to preferentially isolate pathogen-specific (non-E. coli K-12) cDNAs representing APEC genes that are expressed within the tissues of experimentally infected poultry. Consecutive blocking to enrich for selection of APEC-specific genes did not completely eliminate the presence of transcripts that correspond to sequences also present in E. coli K-12. These E. coli conserved genes are likely to be highly expressed in host tissues and contribute to bacterial growth or virulence.

Materials and Methods

Bacterial Strains, Plasmids, and Culture Conditions.

Bacterial strains and plasmids are presented in Table 1. The virulent APEC strain χ7122 (O78:H9:K80) (23) was used for determination of in vivo expressed genes, infection studies, and mutant construction. Cells were routinely grown in Lennox broth (26) at 37°C. Solid medium contained 1.5% agar. For infection experiments, strains were grown for 24 h in beef heart infusion broth (Difco) at 37°C. Antibiotics, when required, were used at the following final concentrations (μg/ml): 10 tetracycline, 30 chloramphenicol, 25 kanamycin, and 12.5 nalidixic acid.

Table 1
E. coli strains and plasmids

General Molecular Techniques.

Bacterial genomic DNA was prepared by using a small-scale method (27). Restriction endonucleases and DNA-modifying and ligase enzymes (New England Biolabs; Promega) were used according to the manufacturers' guidelines. Standard techniques were used for bacterial conjugation and transformation (28).

Experimental Infection, RNA Isolation, cDNA Synthesis, and Amplification.

Three-week-old white leghorn specific-pathogen-free chickens (Charles River Breeding Laboratories) were inoculated into the right thoracic air sac with a 0.1-ml suspension containing 1 × 10⁸ colony-forming units of strain χ7122. Total RNA was isolated from infected tissues (pericardium and air sacs) 6 or 24 h postinfection by using TRIzol reagent (Invitrogen). RNA samples were treated with RNase-free DNase I (Ambion, Austin, TX). RNA concentrations and integrity were determined by A260/A280 spectrophotometer readings and agarose gel electrophoresis, respectively. Five 1-μg samples of total RNA obtained from tissues of five individual chickens were pooled and converted to first-strand cDNAs by random priming with Superscript II reverse transcriptase (Invitrogen). Random priming was performed as described (29) by using oligonucleotides containing terminal sequences at the 5′ end (5′-CGGGATCCAGCTTCTGACGCA-3′ for cDNAs from the air sacs and 5′-GTGGTACCGCTCTCCGTCCGA-3′ for cDNAs from the pericardium) and a random nonamer at the 3′ end (PCR primer-dN9). cDNAs were made double-stranded with Klenow fragment (NEB, Beverly, MA) as described (29). cDNA was then amplified by PCR by using the defined primers for each set of cDNA for 25 cycles.
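The structure of the tagged random primers described above can be illustrated with a short Python sketch. The two 5′ tag sequences are those quoted in the text; the function names and the seed are hypothetical, and in practice the oligonucleotide is synthesized as a degenerate mixture, so the function below merely draws one possible member of that mixture.

```python
import random

# 5' terminal tags quoted in the Methods text (one per tissue source).
TISSUE_TAGS = {
    "air_sac": "CGGGATCCAGCTTCTGACGCA",
    "pericardium": "GTGGTACCGCTCTCCGTCCGA",
}
NONAMER_LENGTH = 9  # "dN9": nine random bases at the 3' end


def random_primer(tissue: str, rng: random.Random) -> str:
    """Return one hypothetical member of the tag-dN9 primer population."""
    tag = TISSUE_TAGS[tissue]
    nonamer = "".join(rng.choice("ACGT") for _ in range(NONAMER_LENGTH))
    return tag + nonamer


def nonamer_diversity() -> int:
    """Number of distinct 9-mers over a four-letter alphabet."""
    return 4 ** NONAMER_LENGTH


if __name__ == "__main__":
    rng = random.Random(0)  # arbitrary seed, for reproducibility only
    for tissue in TISSUE_TAGS:
        print(tissue, random_primer(tissue, rng))
    print("possible nonamers:", nonamer_diversity())
```

Because each tissue receives its own tag, cDNAs from the air sacs and the pericardium can later be amplified separately with the corresponding defined primer.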


Bacterial cDNA capture was done as described previously (19, 20). Briefly, denatured, biotinylated, and sonicated E. coli (χ7122) genomic DNA fragments (0.3 μg) were mixed with sonicated rrnB DNA (pC6) (5 μg) to preblock ribosomal RNA-encoding regions of the genomic DNA. After hybridization at 67°C for 30 min, denatured amplified cDNA (3 μg) from infected tissues was added, and hybridized for 18 h at 67°C. Bacterial cDNAs that hybridized to biotinylated genomic DNA were retained by binding hybrids to streptavidin-coated M280 magnetic beads (Dynal, Bethlehem, PA). Captured cDNA was then eluted, precipitated, and amplified by PCR. In the first round of SCOTS, five separate samples of cDNA were captured by hybridization to biotinylated rDNA-blocked chromosomal DNA in parallel reactions. This was done to enhance the likelihood of recovering cDNA molecules corresponding to a more complete diversity of bacterial transcripts present in host tissues. After the first round of SCOTS, the five amplified cDNA preparations for each type of tissue sampled were combined, denatured, and again hybridized to rDNA-blocked biotinylated chromosomal DNA for two successive rounds of SCOTS. The mixtures were then used to verify successful capture of E. coli sequences by cloning and for competitive hybridization enrichment as described below.

Enrichment for Pathogen-Specific Bacterial cDNAs.

To enrich the bacterial cDNA populations for APEC-specific (non-E. coli K-12) sequences, E. coli χ7122 biotinylated genomic DNA (0.3 μg) was first preblocked with an excess of denatured sheared genomic DNA (10 μg) from E. coli K-12 strain χ289. Bacterial cDNAs that hybridized to biotinylated genomic DNA were retained by binding to streptavidin-coated magnetic beads (Dynal). Bacterial cDNAs were eluted, precipitated, and PCR amplified by using specific terminal sequences. After three rounds of competitive hybridization, bacterial cDNAs were cloned by using the Original TA Cloning kit (Invitrogen). Cloned inserts were sequenced by using the ABI Prism Big Dye primer cycle sequencing kit (PE Applied Biosystems). Sequences were compared with the complete genome of E. coli K-12 (5), and database comparisons were carried out with the blast algorithm (30).
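The blocking ratios implied by the DNA quantities given in the capture and enrichment steps can be checked with a few lines of Python. This is only a mass-ratio calculation; the variable and function names are illustrative, and a mass excess is an approximation of the effective competitor excess rather than a measured value.

```python
# DNA masses (micrograms) quoted in the Methods text.
BIOTINYLATED_APEC_DNA_UG = 0.3   # labeled chi7122 genomic DNA per hybridization
RRNB_BLOCKING_DNA_UG = 5.0       # sonicated rrnB DNA (pC6) used to block rDNA
K12_BLOCKING_DNA_UG = 10.0       # unlabeled K-12 (chi289) genomic DNA


def fold_excess(blocker_ug: float, labeled_ug: float) -> float:
    """Mass ratio of unlabeled blocker DNA to labeled capture DNA."""
    if labeled_ug <= 0:
        raise ValueError("labeled DNA mass must be positive")
    return blocker_ug / labeled_ug


if __name__ == "__main__":
    rdna = fold_excess(RRNB_BLOCKING_DNA_UG, BIOTINYLATED_APEC_DNA_UG)
    k12 = fold_excess(K12_BLOCKING_DNA_UG, BIOTINYLATED_APEC_DNA_UG)
    print(f"rrnB blocker excess: {rdna:.1f}-fold by mass")
    print(f"K-12 blocker excess: {k12:.1f}-fold by mass")
```

The K-12 value (10 μg against 0.3 μg) corresponds to the >33-fold excess of unlabeled K-12 DNA reported for the enrichment rounds.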

Construction of Defined Mutations in Pathogen-Specific Genes of APEC Strain χ7122.

Mutants of strain χ7122 were generated by deletion of pathogen-specific genes that were identified by SCOTS. Generation of strain χ7273 by inactivation of tsh, which encodes the Tsh/hemoglobin protease, is described elsewhere (18). A suicide vector for deletion of the aerobactin-encoding gene cluster (iucABCDiutA) was constructed as follows. An 833-bp fragment of the 5′ end of iucA was generated by PCR by using primers aero1 (5′-gctctagattatgatcctgccctctg-3′; added XbaI site underlined) and aero2 (5′-ttgcggccgctggtagcacagtagagg-3′; added NotI site underlined), and a 1,200-bp fragment of the 3′ end of iutA was generated by PCR by using primers aero3 (5′-ttgcggccgcactgacgggcatttga-3′; added NotI site underlined) and aero4 (5′-atgcatgctgaagctgagtgtacc-3′; added SphI site underlined). These two fragments and a NotI-NotI xylE gene cassette from pMEG685 were cloned into the XbaI and SphI sites of pMEG-375. The resultant suicide vector containing the iucA′-xylE-′iutA fragment was named pYA3662. A suicide vector for deletion of the iroBCDEN genes was constructed as follows. A 663-bp fragment of the 5′ end of iroB was generated by PCR by using primers iroBKO1 (5′-aggcgcgcctctctatgggc-3′; added AscI site underlined) and iroBKO2 (5′-ctctagatcaaggccgtcaacc-3′; added XbaI site underlined), and a 609-bp fragment of the 5′ end of iroN was generated by PCR by using primers iroNKO1 (5′-aagcatgctcctggttgggttgaata-3′; added SphI site underlined) and iroNKO2 (5′-ctctagagcattaccagccagagg-3′; added XbaI site underlined). These two fragments and an XbaI-XbaI npt II gene cassette from pBSL86 were cloned into the AscI and SphI sites of pMEG-375. The resultant suicide vector containing the iroB′-npt II-iroN′ fragment was named pYA3663. Allelic replacements were obtained as described elsewhere (18).
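The restriction sites added to each primer can be located programmatically. The following Python sketch uses the primer sequences exactly as given above together with the standard recognition sequences of XbaI, NotI, SphI, and AscI; the dictionary layout and function name are illustrative only.

```python
# Standard recognition sequences of the enzymes named in the text.
RECOGNITION_SITES = {
    "XbaI": "TCTAGA",
    "NotI": "GCGGCCGC",
    "SphI": "GCATGC",
    "AscI": "GGCGCGCC",
}

# Primer name -> (sequence 5'->3' as printed, enzyme whose site was added).
PRIMERS = {
    "aero1": ("gctctagattatgatcctgccctctg", "XbaI"),
    "aero2": ("ttgcggccgctggtagcacagtagagg", "NotI"),
    "aero3": ("ttgcggccgcactgacgggcatttga", "NotI"),
    "aero4": ("atgcatgctgaagctgagtgtacc", "SphI"),
    "iroBKO1": ("aggcgcgcctctctatgggc", "AscI"),
    "iroBKO2": ("ctctagatcaaggccgtcaacc", "XbaI"),
    "iroNKO1": ("aagcatgctcctggttgggttgaata", "SphI"),
    "iroNKO2": ("ctctagagcattaccagccagagg", "XbaI"),
}


def site_position(primer: str, enzyme: str) -> int:
    """0-based index of the enzyme's site in the primer, or -1 if absent."""
    return primer.upper().find(RECOGNITION_SITES[enzyme])


if __name__ == "__main__":
    for name, (sequence, enzyme) in PRIMERS.items():
        pos = site_position(sequence, enzyme)
        status = f"found at position {pos}" if pos >= 0 else "NOT FOUND"
        print(f"{name:8s} {enzyme:5s} {status}")
```

Each site lies within the first few nucleotides of its primer, which is consistent with the sites having been appended to the 5′ end of the gene-specific sequence.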

Virulence Studies.

Three-week-old white leghorn specific-pathogen-free chickens were inoculated into the right thoracic air sac with a 0.1-ml suspension containing 1 × 10⁷ colony-forming units of strain χ7122 or mutant derivatives. Birds were euthanized 48 h postinfection, and bacterial quantification in tissues was performed as described previously (18).

Results


Selective Capture of APEC Transcripts Expressed in Tissues of Infected Chickens.

After infection, pericardial and air sac tissues were used to isolate RNA and to quantitate bacteria. Bacterial counts varied from 10⁴ to 10⁷ colony-forming units/gram of tissue 6 or 24 h postinfection. We first isolated total E. coli transcripts from infected tissues of chickens by using SCOTS. This sampling of bacterial cDNAs represented total transcripts produced by bacteria within the tissues of infected chickens. Initial screening of some of the cDNA sequences isolated from these pools demonstrated that the majority represented sequences present in E. coli K-12. Identified sequences corresponded to fhuF, ompX, mdh, ybiL (fiu), yegE, yhdG, yfhA, yghU, and the IS2 transposase gene. We therefore did not pursue further identification of sequences from these cDNA pools. These results confirmed, however, that by using SCOTS, E. coli-specific transcripts were successfully isolated from tissues of infected chickens.

Enrichment for Identification of Pathogen-Specific Sequences Expressed in Vivo.

To identify transcripts that were specific to APEC strain χ7122, we subjected the total bacterial cDNA pools to three further rounds of SCOTS in the presence of a >33-fold excess of unlabeled E. coli K-12 DNA. An abundance of unlabeled E. coli K-12 DNA was expected to reduce the capture of sequences common to APEC χ7122 and E. coli K-12 and promote the capture of APEC χ7122-specific transcripts. After this blocking step, cDNAs were cloned. Randomly chosen cDNA clones derived from transcripts in the air sacs or pericardium were selected and sequenced. The clones, termed E. coli captured sequences (ecs), contained numerous APEC-specific sequences (Table 2). Despite three rounds of blocking to reduce the presence of sequences common to E. coli K-12 and APEC strain χ7122, one-third of the ecs clones contained fragments of genes corresponding to E. coli K-12 sequences (Table 3). Among the 66 distinct ecs clones, only one clone was identified twice from sequenced samples. Forty-four of the 66 distinct clones were APEC-specific. Pathogen-specific clones contained sequences homologous to known and novel putative bacterial gene products involved in adherence, iron transport, lipopolysaccharide (LPS) synthesis, plasmid replication and conjugation, putative phage-encoded products, and gene products of unknown function (Table 2). More detailed information regarding homologs of the pathogen-specific ecs clones and ecs clones corresponding to E. coli conserved genes is published as supporting information on the PNAS web site (www.pnas.org).
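The clone counts reported above are internally consistent, as the following short Python check shows. The counts are those given in the text; the variable names are illustrative.

```python
from fractions import Fraction

# Counts reported in the text for the sequenced ecs clones.
TOTAL_DISTINCT_CLONES = 66
APEC_SPECIFIC_CLONES = 44

# Clones matching E. coli K-12 sequences are the remainder.
conserved_clones = TOTAL_DISTINCT_CLONES - APEC_SPECIFIC_CLONES

conserved_fraction = Fraction(conserved_clones, TOTAL_DISTINCT_CLONES)
specific_fraction = Fraction(APEC_SPECIFIC_CLONES, TOTAL_DISTINCT_CLONES)

if __name__ == "__main__":
    print("conserved (K-12) clones:", conserved_clones)
    print("fraction conserved:", conserved_fraction)
    print("fraction APEC-specific:", specific_fraction)
```

The remainder of 22 clones, or one-third of the total, matches the number of conserved clones listed in Table 3.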

Table 2
E. coli pathogen-specific ecs clones identified by SCOTS
Table 3
E. coli conserved genes expressed in vivo identified by SCOTS

Role of Pathogen-Specific Genes Encoded by Virulence Plasmid pAPEC-1.

Plasmid pAPEC-1, a Colicin V-type plasmid, is required for the virulence of strain χ7122 in chickens (18). Genomic analyses demonstrated that the genes encoding Tsh, aerobactin, and the Iro system, which were all identified by SCOTS, are located on plasmid pAPEC-1. These three systems may contribute to acquisition of iron by APEC strain χ7122, because aerobactin is a pathogen-associated iron-sequestering system, Tsh is a hemoglobin protease, and the uncharacterized Iro system encodes a putative siderophore receptor. To examine the role of these in vivo expressed pathogen-specific systems, mutant derivatives of strain χ7122 were generated and tested in the avian infection model. The iro locus was cloned from HindIII-digested fragments of plasmid pAPEC-1 and was sequenced (GenBank accession no. AF449498). The aerobactin, Iro, and/or Tsh encoding genes were deleted or inactivated in derivatives of pathogenic strain χ7122, and the resulting strains were tested for virulence in chickens. The Iro⁻ mutant was tested to determine the individual role of this system for virulence and persistence in chickens compared with a mutant lacking Tsh and aerobactin, the other two known pathogen-specific systems implicated in APEC virulence, and a triple mutant (Iro⁻, Tsh⁻, Aerobactin⁻) lacking all three systems. Bacterial numbers were significantly reduced in the lungs, livers, and spleens of chickens infected with either the Aerobactin⁻ Tsh⁻ mutant (χ7301) or the Iro⁻ mutant (χ7303) compared with the wild-type parent strain (χ7122) 48 h after infection (Fig. 1). In addition, strains χ7301 and χ7303 exhibited reduced lesions of airsacculitis and only mild pericarditis and perihepatitis compared with the wild-type parent. Although χ7301 is also Tsh⁻, Tsh alone is not required for persistence of strain χ7122 in deeper tissues of chickens (18). In addition, an Aerobactin⁻ derivative of χ7122 persists in deeper tissues at a level similar to that of the Tsh⁻ Aerobactin⁻ strain χ7301 (C.M.D. and R.C., unpublished data). The Tsh⁻ Iro⁻ Aerobactin⁻ mutant (χ7306) did not survive in extraintestinal tissues 48 h postinfection (Fig. 1) and caused minor lesions that were limited to the air sac site of inoculation. To confirm the role of the iro locus for APEC virulence, plasmid pYA3661 containing the iroBCDEN genes was introduced into strains χ7303 (Iro⁻) and χ7306 (Iro⁻, Tsh⁻, Aerobactin⁻). Strains χ7303 and χ7306 containing pYA3661 persisted in the liver (Fig. 1C) and caused lesions of pericarditis and perihepatitis to the same extent as the wild-type APEC strain χ7122.

Figure 1
Bacterial numbers present in the lungs (A), spleens (B), and livers (C) of chickens infected with wild-type APEC strain χ7122 or isogenic mutant derivatives. Data points represent bacterial counts from tissues isolated from different chickens ...

Identification of in Vivo Expressed E. coli Genes Common to APEC and K-12 Strains.

Twenty-two of the ecs clones identified after enrichment for isolation of pathogen-specific sequences represented transcripts that correspond to genes present in both APEC strain χ7122 and E. coli K-12 (Table 3). These ecs clones corresponded to genes required for metabolism, cell growth, stress response, or regulation, and genes of unknown function. Because these conserved transcripts remained after pathogen-specific cDNA enrichment, it is likely that they are abundantly expressed in infected tissues and were therefore not as effectively eliminated from the cDNA pools. This indicates that in addition to identifying pathogenic E. coli-specific sequences, the enrichment procedure favored the isolation of E. coli conserved genes that are likely to be highly expressed in vivo. Therefore, it is likely that the E. coli conserved genes identified by SCOTS after enrichment for pathogen-specific E. coli genes represent important metabolic or regulatory systems expressed in vivo, and many of these genes are also likely to be essential for bacterial growth or for full virulence.

Discussion


Determination of bacterial genes that are expressed during infection or that are essential or required for virulence in the host may provide valuable information to prevent and control infectious diseases. In recent years, identification of in vivo expressed or in vivo required bacterial genes has been achieved by using a number of different approaches. The most commonly used methods have included signature-tagged mutagenesis (STM), a negative-selection method that involves comparative isolation of individual specifically tagged transposon-generated mutants from pools of mutants propagated in vitro and in vivo (32), and promoter fusion methods such as in vivo expression technology (IVET) (33). IVET positively selects for in vivo promoter activity that provides transcription of a gene product required for survival in vivo. In addition, differential fluorescence induction (DFI), a promoter trap-gfp (green fluorescent protein) fusion-based differential selection method, identifies bacterial genes induced within host cells by determining comparative levels of fluorescence by flow cytometry (34). Recently, a bacterial transcript analysis method termed SCOTS has been used to identify genes expressed by Mycobacterium spp. (19, 22) and S. enterica in phagocytes (20, 21). In this report, we have extended the use of SCOTS to identify in vivo expressed bacterial genes of a pathogenic E. coli strain within a natural host experimental animal model. With the aim of discovering pathogen-specific genes that are expressed in vivo, we directly enriched the pool of in vivo expressed genes to contain pathogen-specific (non-E. coli K-12) gene transcripts and demonstrated by generation of defined-deletion mutants that the pathogen-specific aerobactin-encoding and iro gene clusters were required for full virulence. Notably, many of the APEC-specific and conserved E. coli genes or their homologs that were identified by SCOTS were identified by STM, IVET, and/or DFI in different bacterial pathogens (see supporting information on the PNAS web site).

APEC-specific transcripts isolated from host tissues exhibited homologies to genes involved in adherence, iron transport, LPS synthesis, plasmid replication and conjugation, putative phage-encoded products, and gene products of unknown function (Table 2). Many of these genes represent newly identified sequences, whereas others have been previously associated with virulence in E. coli. Transcript ecs-5 isolated from the air sacs corresponded to tsh (hbp). Tsh (Hbp), a Tsh/hemoglobin protease, was the first enterobacterial serine-protease autotransporter to be identified (35) and is associated with virulent avian E. coli (18) and human ExPEC (36). Identification of the expression of tsh in the air sacs correlates with our infection studies that demonstrated a role for Tsh in the development of lesions in the air sacs (18).

SCOTS clones ecs-6 and ecs-7 corresponded to genes involved in the synthesis of the R1-type core LPS, and ecs-52 corresponded to kdsA, a conserved gene that is required for synthesis of 3-deoxy-D-manno-octulosonic acid (KDO), an essential component of the LPS inner core. LPS is of major importance to pathogenic bacteria, because it can contribute to resistance to complement and phagocytic cells. The O-antigen of strain χ7122 was previously shown to be required for virulence (11). The identification of SCOTS clones corresponding to iron-regulated pathogen-specific genes (ecs-8 to ecs-12) is in line with the fact that in host tissues iron availability is limited. Clones corresponding to genes of the aerobactin and Iro systems were identified. Deletion of the iro gene cluster and/or aerobactin-encoding systems clearly demonstrated their importance for persistence and generation of lesions in deeper tissues by strain χ7122 (Fig. 1). Aerobactin is a siderophore associated with ExPEC from humans and is also associated with virulent APEC strains. Aerobactin has also been demonstrated to contribute to the virulence of extraintestinal E. coli strains in the mouse urinary tract infection (37) and ovine oral infection models (38).

The E. coli iroBCDEN gene cluster encodes putative proteins homologous to those encoded by the iro locus of S. enterica (39). In S. enterica, the iro locus is derepressed under iron-limiting conditions (39), and in ExPEC strain CP9, iroN expression was shown to be repressed by iron (40). Before this report, a direct contribution of the iro locus to the virulence of E. coli had not been demonstrated. Inactivation of the Tsh, Iro, and aerobactin systems greatly reduced in vivo persistence (Fig. 1) and development of lesions in respiratory and deeper tissues. Reintroduction of the iro genes on a multicopy plasmid restored the capacity of the Aerobactin⁻ Iro⁻ Tsh⁻ mutant to persist in vivo to levels similar to those of wild-type strain χ7122. The increased copy number of the iro genes from the p15A replicon is most likely responsible for an overt regain in virulence in the Aerobactin⁻ Iro⁻ Tsh⁻ mutant. However, these complementation results suggest that the pAPEC-1 encoded aerobactin and Iro systems function in concert to increase bacterial acquisition of iron within host extraintestinal tissues.

The most common pathogen-specific transcripts isolated were homologous to genes associated with ColE2-related plasmids (ecs-13 to ecs-28). ColE2-related plasmids are small (5–11 kb) multicopy (10–20 copies per cell) plasmids (41). The higher copy level of a ColE2-related plasmid present in χ7122 would lead to abundant levels of corresponding transcript. This is most likely why sequences corresponding to this plasmid were frequently isolated by SCOTS. Reduced isolation of clones corresponding to the ColE2-related plasmid could have been obtained by including an excess of unlabeled ColE2-related plasmid DNA along with E. coli K-12 DNA during capture rounds. Transcripts corresponding to tra genes and other F plasmid-related genes (ecs-29 to ecs-32) were also identified by SCOTS. F-pilus export and plasmid transfer comprise a type IV secretion system. Many type IV secretion systems are critical factors for the virulence of bacterial pathogens (42), although a role for F-pili in the virulence of E. coli has not been demonstrated. Bacteriophage-related genes were also identified by SCOTS. Bacteriophage-associated genes may code for virulence properties such as toxins and adhesins. It is currently not known whether phage-encoded genes are associated with the virulence of APEC strain χ7122. Among the APEC-specific transcripts of unknown function, ecs-43 is highly similar to the core gene of a non-E. coli K-12 rearrangement hotspot element RhsH (43). Interestingly, in E. coli K-12, Rhs core ORFs are not expressed to a detectable extent during routine cultivation, and despite their presence in many E. coli strains, the function of these elements has not been assessed.

Some representative SCOTS clones isolated after selective enrichment for pathogen-specific genes corresponded to genes that are also common to E. coli K-12. Importantly, many of the E. coli K-12 genes to which these ecs clones corresponded have been identified as preferentially expressed or required in vivo or have been shown to be essential genes (details are published as supporting information on the PNAS web site). Notably, purA and pyrG are required for synthesis of purines and pyrimidines, respectively, in E. coli. The importance of synthesis of nucleotide precursors for the virulence of bacteria in vivo is well established. purA (ecs-49) was selected as the initial null mutation and gene fusion to be used for in vivo gene expression analysis by in vivo expression technology in Salmonella typhimurium (33), because it is required for bacterial survival and virulence in vivo. Genes implicated in regulation and stress response were also identified by SCOTS. uspB (ecs-62) (universal stress protein B) is a σS-dependent gene that contributes to resistance to ethanol (44). topA (ecs-61) encodes DNA topoisomerase I, which contributes to DNA supercoiling, global gene transcription, bacterial survival against high osmolarity (45), and oxidative damage (46). phoB (ecs-60) encodes the global response regulator of the phosphate regulon. PhoB has been shown to regulate hilA and invasion genes in S. typhimurium (47), and a phoB mutant of V. cholerae is less able to colonize the rabbit intestine (48). Although phoB has not been shown to contribute to virulence in E. coli, a pathogenic E. coli strain causing septicemia in swine that is defective for high-affinity phosphate transport was reduced in virulence (49). Among the genes of unknown function identified by SCOTS, ynfK (ecs-66) encodes a putative dethiobiotin synthase. Interestingly, avidin, which is a biotin-binding protein, is induced in avian tissues after E. coli infection (50), and therefore biotin limitation may be an innate host response against infection.

The isolation of certain E. coli K-12 genes after selective reduction of these sequences suggests that these genes are highly transcribed in vivo. Higher transcription levels would result in an abundance of the corresponding sequences in the SCOTS cDNA pool, and complete removal of these abundant products would therefore be less likely, even after some of these transcripts were depleted with an excess of E. coli K-12 genomic DNA. The pertinence of most of these conserved genes for survival in vivo also supports the likelihood that a greater amount of transcription of these genes may occur in vivo. Further experiments, such as quantitative real-time PCR or microarray analyses, will be required to confirm whether these E. coli conserved genes are actually expressed at high levels in vivo in infected tissues or during growth in vitro.
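The abundance argument made above can be illustrated with a deliberately simple toy model in Python. It is not derived from the data in this study: the escape fraction per round and the starting copy numbers are hypothetical values chosen only for illustration, and the model ignores hybridization kinetics and the PCR amplification performed between rounds.

```python
def remaining_after_blocking(initial_copies: float,
                             escape_fraction: float,
                             rounds: int = 3) -> float:
    """Expected copies of a conserved cDNA left after repeated blocked capture.

    Assumes (hypothetically) that in every round the same fraction of a
    conserved cDNA escapes the unlabeled K-12 blocker and is still captured.
    """
    if not 0.0 <= escape_fraction <= 1.0:
        raise ValueError("escape_fraction must lie between 0 and 1")
    return initial_copies * escape_fraction ** rounds


if __name__ == "__main__":
    # Hypothetical starting abundances of two conserved transcripts.
    hypothetical_pool = {
        "highly expressed conserved gene": 100_000,
        "weakly expressed conserved gene": 100,
    }
    escape = 0.1  # hypothetical: 10% escapes blocking in each round
    for name, copies in hypothetical_pool.items():
        left = remaining_after_blocking(copies, escape)
        print(f"{name}: about {left:.1f} copies after 3 rounds")
```

Under these assumptions the two transcripts are depleted by the same factor, so the initially abundant one is still far more likely to be recovered among a limited number of sequenced clones.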

The current study provides only a glimpse of some of the bacterial genes that are transcribed inside the host during E. coli infection, because only a fraction of the clones that were isolated by SCOTS have been analyzed so far. Moreover, because the major purpose of this study was to identify pathogen-specific genes that were expressed in host tissues, we have not addressed whether these genes are expressed specifically or preferentially in vivo or in a particular tissue location. However, such analyses could readily be performed by selective enrichment of sequences transcribed under specific conditions, as has been the focus of other reports using SCOTS (19–21). The approach we have presented herein could also be used to identify pathogen-specific genes that are transcribed in vivo in other types of pathogenic E. coli or closely related Shigella spp. in different animal or cell culture models. In addition, for other bacterial species, differences in gene transcription in host cells or tissues could be established by comparative blocking between different strains belonging to the same or similar species with high overall DNA homology but that have different degrees of virulence or host specificities. For instance, in vivo gene expression could be compared between S. typhimurium and Salmonella typhi or between Yersinia pseudotuberculosis and Y. pestis in appropriate infection models. Identification of pathogen-specific and conserved bacterial genes that are expressed in vivo will provide further insight into the mechanisms by which bacteria colonize host tissues, cope with or circumvent host defenses, and adjust to the nutrient limitations and other stresses that occur in different host environments.



We thank C. Squires (Tufts University, Boston) for plasmid pC6, Brian Morrow for helpful suggestions, and J. Clark-Curtiss for reviewing the manuscript, and we gratefully acknowledge Daryoush Saeed-Vafa for isolation of the DNA clone encompassing the iro gene cluster. This work was supported by U.S. Department of Agriculture National Research Initiative Competitive Grants Program Grants 97-35204-4512 and 00-35204-9224 (to R.C. and C.M.D.) and by National Institutes of Health Grant AI24533 (to R.C.). C.M.D. and F.D. were supported by Canadian Natural Sciences and Engineering Research Council postdoctoral fellowships.


Abbreviations

APEC, avian pathogenic Escherichia coli
ExPEC, extraintestinal pathogenic E. coli
SCOTS, selective capture of transcribed sequences
Tsh, temperature-sensitive hemagglutinin


Data deposition: The sequence reported in this paper has been deposited in the GenBank database (accession no. AF449498).


References

1. Sussman M, editor. Escherichia coli: Mechanisms of Virulence. Cambridge, U.K.: Cambridge Univ. Press; 1997.
2. Russo T A, Johnson J R. J Infect Dis. 2000;181:1753–1754.
3. Dozois C M, Curtiss R, III. Vet Res. 1999;30:157–179.
4. Ochman H, Lawrence J G, Groisman E A. Nature. 2000;405:299–304.
5. Blattner F R, Plunkett G, III, Bloch C A, Perna N T, Burland V, Riley M, Collado-Vides J, Glasner J D, Rode C K, Mayhew G F, et al. Science. 1997;277:1453–1474.
6. Perna N T, Plunkett G, III, Burland V, Mau B, Glasner J D, Rose D J, Mayhew G F, Evans P S, Gregor J, Kirkpatrick H A, et al. Nature. 2001;409:529–533.
7. Reid S D, Herbelin C J, Bumbaugh A C, Selander R K, Whittam T S. Nature. 2000;406:64–67.
8. Bergthorsson U, Ochman H. Mol Biol Evol. 1998;15:6–16.
9. Jarvis K G, Giron J A, Jerse A E, McDaniel T K, Donnenberg M S, Kaper J B. Proc Natl Acad Sci USA. 1995;92:7996–8000.
10. Bloch C A, Rode C K. Infect Immun. 1996;64:3218–3223.
11. Brown P K, Curtiss R, III. Proc Natl Acad Sci USA. 1996;93:11149–11154.
12. Fang F C, Libby S J, Buchmeier N A, Loewen P C, Switala J, Harwood J, Guiney D G. Proc Natl Acad Sci USA. 1992;89:11978–11982.
13. Garcia-Del Portillo F, Pucciarelli M G, Casadesus J. Proc Natl Acad Sci USA. 1999;96:11578–11583.
14. Low D A, Weyand N J, Mahan M J. Infect Immun. 2001;69:7197–7204.
15. Sperandio V, Mellies J L, Nguyen W, Shin S, Kaper J B. Proc Natl Acad Sci USA. 1999;96:15196–15201.
16. Dho-Moulin M, Fairbrother J M. Vet Res. 1999;30:299–316.
17. Carlson H C, Whenham G R. Avian Dis. 1968;12:297–302.
18. Dozois C M, Dho-Moulin M, Bree A, Fairbrother J M, Desautels C, Curtiss R, III. Infect Immun. 2000;68:4145–4154.
19. Graham J E, Clark-Curtiss J E. Proc Natl Acad Sci USA. 1999;96:11554–11559.
20. Morrow B J, Graham J E, Curtiss R, III. Infect Immun. 1999;67:5106–5116.
21. Daigle F, Graham J E, Curtiss R, III. Mol Microbiol. 2001;41:1211–1222.
22. Hou J Y, Graham J E, Clark-Curtiss J E. Infect Immun. 2002;70:3714–3726.
23. Provence D L, Curtiss R, III. Infect Immun. 1992;60:4460–4467.
24. Chang A C, Cohen S N. J Bacteriol. 1978;134:1141–1156.
25. Alexeyev M F. BioTechniques. 1995;18:52–56.
26. Lennox E S. Virology. 1955;1:190–206.
27. Ausubel F M, Brent R, Kingston R E, Moore D M, Seidman J G, Smith J A, Struhl K. Molecular Cloning: A Laboratory Manual. New York: Greene and Wiley Interscience; 1991.
28. Miller J H. A Short Course in Bacterial Genetics: A Laboratory Manual for Escherichia coli and Related Bacteria. Plainview, NY: Cold Spring Harbor Lab. Press; 1992.
29. Froussard P. Nucleic Acids Res. 1992;20:2900.
30. Altschul S F, Gish W, Miller W, Myers E W, Lipman D J. J Mol Biol. 1990;215:403–410.
31. Neidhardt F C, Curtiss R, III, Ingraham J L, Lin E C C, Low K B, Magasanik B, Reznikoff W S, Riley M, Schaechter M, Umbarger H E. Escherichia coli and Salmonella: Molecular and Cellular Biology. Washington, DC: Am. Soc. Microbiol.; 1996.
32. Hensel M, Shea J E, Gleeson C, Jones M D, Dalton E, Holden D W. Science. 1995;269:400–403.
33. Mahan M J, Slauch J M, Mekalanos J J. Science. 1993;259:686–688.
34. Valdivia R H, Falkow S. Science. 1997;277:2007–2011.
35. Provence D L, Curtiss R, III. Infect Immun. 1994;62:1369–1380.
36. Otto B R, van Dooren S J, Dozois C M, Luirink J, Oudega B. Infect Immun. 2002;70:5–10.
37. Torres A G, Redford P, Welch R A, Payne S M. Infect Immun. 2001;69:6179–6185.
38. Der Vartanian M, Jaffeux B, Contrepois M, Chavarot M, Girardeau J P, Bertin Y, Martin C. Infect Immun. 1992;60:2800–2807.
39. Baumler A J, Norris T L, Lasco T, Voight W, Reissbrodt R, Rabsch W, Heffron F. J Bacteriol. 1998;180:1446–1453.
40. Russo T A, Carlino U B, Mong A, Jodush S T. Infect Immun. 1999;67:5306–5314.
41. Mock M, Pugsley A P. J Bacteriol. 1982;150:1069–1076.
42. Christie P J, Vogel J P. Trends Microbiol. 2000;8:354–360.
43. Wang Y D, Zhao S, Hill C W. J Bacteriol. 1998;180:4102–4110.
44. Farewell A, Kvint K, Nystrom T. J Bacteriol. 1998;180:6140–6147.
45. Bhriain N N, Dorman C J. Mol Microbiol. 1993;7:351–358.
46. Tse-Dinh Y C. J Bacteriol. 2000;182:829–832.
47. Lucas R L, Lostroh C P, DiRusso C C, Spector M P, Wanner B L, Lee C A. J Bacteriol. 2000;182:1872–1882.
48. von Kruger W M, Humphreys S, Ketley J M. Microbiology. 1999;145:2463–2475.
49. Daigle F, Fairbrother J M, Harel J. Infect Immun. 1995;63:4924–4927.
50. Elo H A, Raisanen S, Tuohimaa P J. Experientia. 1980;36:312–313.

Articles from Proceedings of the National Academy of Sciences of the United States of America are provided here courtesy of National Academy of Sciences