Copyright: © 2006 Dunning Hotopp. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Comparative Genomics of Emerging Human Ehrlichiosis Agents

1 The Institute for Genomic Research, Rockville, Maryland, United States of America
2 Department of Veterinary Biosciences, College of Veterinary Medicine, The Ohio State University, Columbus, Ohio, United States of America
3 J. Craig Venter Joint Technology Center, Rockville, Maryland, United States of America

Paul M. Richardson, Editor
The US DoE Joint Genome Institute, United States of America

* To whom correspondence should be addressed. E-mail: jdunning/at/tigr.org

¤a Current address: Developmental and Regenerative Neurobiology Program, Department of Neurology, Institute of Molecular Medicine and Genetics, Medical College of Georgia, Augusta, Georgia, United States of America
¤b Current address: Laboratory of Environmental Microbiology, Institute for Environmental Sciences, Suruga, Shizuoka, Japan
¤c Current address: National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, Maryland, United States of America
¤d Current address: Eastern Virginia Medical School, Norfolk, Virginia, United States of America

Received October 20, 2005; Accepted January 9, 2006.

This article has been corrected. See PLoS Genet. 2006 December 29; 2(12): e213.

Abstract

Anaplasma (formerly Ehrlichia) phagocytophilum, Ehrlichia chaffeensis, and Neorickettsia (formerly Ehrlichia) sennetsu are intracellular vector-borne pathogens that cause human ehrlichiosis, an emerging infectious disease. We present the complete genome sequences of these organisms along with comparisons to other organisms in the Rickettsiales order. Ehrlichia spp. and Anaplasma spp.
display a unique large expansion of immunodominant outer membrane proteins facilitating antigenic variation. All Rickettsiales have a diminished ability to synthesize amino acids compared to their closest free-living relatives. Unlike members of the Rickettsiaceae family, these pathogenic Anaplasmataceae are capable of making all major vitamins, cofactors, and nucleotides, which could confer a beneficial role in the invertebrate vector or the vertebrate host. Further analysis identified proteins potentially involved in vacuole confinement of the Anaplasmataceae, a life cycle involving a hematophagous vector, vertebrate pathogenesis, human pathogenesis, and lack of transovarial transmission. These discoveries provide significant insights into the biology of these obligate intracellular pathogens.

Synopsis

Ehrlichiosis is an acute disease that triggers flu-like symptoms in both humans and animals. It is caused by a range of bacteria transmitted by ticks or flukes. Because these bacteria are difficult to culture, however, the organisms are poorly understood. The genomes of three emerging human pathogens causing ehrlichiosis were sequenced. A database was designed to allow the comparison of these three genomes to sixteen other bacteria with similar lifestyles. Analysis from this database reveals new species-specific and disease-specific genes indicating niche adaptations, pathogenic traits, and other features. In particular, one of the organisms contains more than 100 copies of a single gene involved in interactions with the host(s). These comparisons also enabled a reconstruction of the metabolic potential of five representative genomes from these bacteria and their close relatives. With this work, scientists can study these emerging pathogens in earnest.

Introduction

Anaplasma phagocytophilum, Ehrlichia chaffeensis, and Neorickettsia sennetsu are small (approximately 0.4–1.5 μm), pleomorphic α-Proteobacteria.
These bacteria are human pathogens that replicate in membrane-bound compartments inside host granulocytes (A. phagocytophilum) or monocytes/macrophages (E. chaffeensis and N. sennetsu) [1–3]. They are obligate intracellular pathogens with a life cycle that involves both vertebrate and invertebrate hosts. A. phagocytophilum and E. chaffeensis depend on hematophagous ticks as vectors and wild mammals as reservoir hosts (Table 1) [2,4]. Unknown trematodes are suspected to be the vector and reservoir of N. sennetsu [1]. No vaccine exists for any of these human pathogens.
A. phagocytophilum is the causative agent of human granulocytic anaplasmosis (HGA), formerly recognized as human granulocytic ehrlichiosis (HGE) [5,6]. Infection with A. phagocytophilum causes fever, headache, myalgia, anorexia, and chills [7]. Prior to 1994, only ruminant and equine ehrlichiosis were known to be caused by this organism [1]. A. phagocytophilum is transmitted by Ixodes spp. Cases of HGA correspond to the distribution of Ixodes spp. being identified in New England, the mid-Atlantic region, the upper Midwest, and northern California in the United States, as well as in parts of Europe. A. phagocytophilum is one of the leading causes of ehrlichiosis in the world. Recent serological data suggest that as much as 15%–36% of the population in endemic areas has been infected [8]. Far fewer individuals are diagnosed with a symptomatic infection that varies in severity from fever to death [8]. Half of all symptomatic patients require hospitalization, and 5%–7% require intensive care [8]. Human monocytic ehrlichiosis (HME), caused by E. chaffeensis, was discovered in 1986 [9–11]. HME is a systemic disease indistinguishable from HGA [12]. E. chaffeensis has been most commonly identified in the Lone Star tick (Amblyomma americanum), with white-tailed deer considered to be the major reservoir. Over 500 cases of HME were diagnosed from 1986 to 1997, predominantly in the south-central and southeastern United States [12]. The recognition and increased prevalence of the disease has been proposed to be related to changes in the host-vector ecology [12]. As with all emerging diseases, it is likely outbreaks occurred in the preceding decades. Notably, 1,000 troops training in Texas contracted an unexplained disease with similar symptoms after exposure to the vector from 1942 to 1943 [12]. N. sennetsu is a monocytotropic species that causes sennetsu ehrlichiosis, an infectious mononucleosis-like disease with fever, fatigue, general malaise, and lymphadenopathy [1,13]. 
Less is known about the distribution of N. sennetsu than about that of Anaplasma and Ehrlichia. However, sequencing of its genome allows for interesting comparisons, since tissue tropism and clinical symptoms are similar but the vector (unknown trematodes) is different. Additionally, in the United States and Canada, domestic animals infected with the closely related N. risticii develop Potomac horse fever, an acute febrile disease accompanied by diarrhea with high morbidity and mortality [14,15]. The related N. helminthoeca causes acute and highly fatal salmon-poisoning disease of domestic and wild canines [14,16]. Along with Wolbachia, these bacteria are members of the Anaplasmataceae family (Figure 1).
Together with the Rickettsiaceae, the Anaplasmataceae are members of the order Rickettsiales (Figure 1). Three Rickettsiaceae genomes have been published: Rickettsia prowazekii, R. conorii, and R. typhi [18,20,21]. Four Anaplasmataceae genomes have been published: the insect parasite W. pipientis wMel, the filarial nematode endosymbiont Wolbachia sp. wBm, the bovine pathogen Anaplasma marginale, and the bovine pathogen Ehrlichia ruminantium [19,22–24]. We present here a comparison of the previously completed Rickettsiales genomes to the first complete genomes of three representative Anaplasmataceae human pathogens: A. phagocytophilum, E. chaffeensis, and N. sennetsu. The complete genome sequences of these human pathogens will enhance the opportunities for investigation of virulence factors, pathogenesis, immune modulation, and novel targets for antimicrobial therapy and vaccines.

Results/Discussion

Genome Anatomy

A. phagocytophilum, E. chaffeensis, and N. sennetsu each have a single circular chromosome (Figure S1). Most genomic features are typical of the sequenced Rickettsiales (Table 2). W. pipientis wMel, Ehrlichia spp., and Anaplasma spp., which are most closely related, all have numerous repeats in their genomes. In contrast, N. sennetsu and R. prowazekii have only six repeats in their respective genomes (Table 2). The repetitive nature of the Ehrlichia and Anaplasma genomes is exemplified by the expansion of outer membrane proteins of the OMP-1/P44/Msp2 family (discussed below). In addition, numerous other functionally important genes are duplicated, including those involved in type IV secretion and vitamin/cofactor biosynthesis.
The origin of replication was not experimentally determined in any of the genomes. As with other Rickettsiales [18], genes typically clustered near the origin (dnaA, gyrA, gyrB, rpmH, dnaN, parA, and parB) were dispersed throughout the genomes. For E. chaffeensis and N. sennetsu, a clear shift in GC-skew occurs near parA and parB (Figure 2).
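The idea of locating a shift in GC-skew can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used in the study: the window size, the function names `gc_skew_windows` and `cumulative_skew_minimum`, and the toy sequence are all assumptions. Skew is computed per window as (G − C)/(G + C), and the position where the cumulative skew reaches its minimum is reported as the place where the skew changes sign, which is commonly read as a candidate origin.

```python
def gc_skew_windows(seq, window=1000):
    """Return (start, skew) pairs, with skew = (G - C) / (G + C) per window."""
    seq = seq.upper()
    skews = []
    for start in range(0, len(seq), window):
        chunk = seq[start:start + window]
        g = chunk.count("G")
        c = chunk.count("C")
        # Windows without any G or C get a neutral skew of 0.
        skew = (g - c) / (g + c) if (g + c) else 0.0
        skews.append((start, skew))
    return skews


def cumulative_skew_minimum(seq, window=1000):
    """Return the end coordinate of the window where cumulative skew is lowest."""
    total = 0.0
    best_pos, best_total = 0, float("inf")
    for start, skew in gc_skew_windows(seq, window):
        total += skew
        if total < best_total:
            best_total = total
            best_pos = min(start + window, len(seq))
    return best_pos


# Toy sequence (hypothetical): a C-rich half followed by a G-rich half,
# so the skew switches from negative to positive at position 40.
toy = "ACCCT" * 8 + "AGGGT" * 8
print(gc_skew_windows(toy, window=20))
print(cumulative_skew_minimum(toy, window=20))
```

On a real chromosome the sequence would be read from a FASTA file and the window would be far larger; the toy input only shows the sign change that the skew plot is meant to reveal.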
Only three islands of synteny over 10 kb in length are conserved among all the sequenced Anaplasmataceae, and these islands are shared among all the Rickettsiales (Figure 2). Of genes typically clustered near the origin, parA and parB were not identified in A. phagocytophilum. Likewise, parA and parB are truncated in the Wolbachia sp. wBm. In various mutational studies in free-living prokaryotes, the effects of inactivation of parA or parB range from lethality to production of anucleated cells at low copy number [25,26]. Without parA and parB, A. phagocytophilum and the Wolbachia sp. wBm may have random chromosome partitioning, may require an alternate partitioning factor, or may have inefficient chromosome partitioning. Of all the sequenced Anaplasmataceae, only the Anaplasma spp. and Ehrlichia spp. share conserved gene order (synteny) across their chromosomes (Figure 2).
In addition to the synteny breaks near the Rho termination factors, A. marginale has rearrangements located near the msp2 and msp3 expression loci and their corresponding pseudogenes (Figure 3).

Genome Comparisons

In order to compare the genomic content of the Rickettsiales to that of other intracellular bacteria, ortholog clusters were delineated for 19 representatives of obligate and facultative intracellular pathogens and endosymbionts (see Materials and Methods). Such comparisons show conservation of 176 ortholog clusters across these intracellular bacteria (Table S1), most of which correspond to housekeeping functions. Eleven ortholog clusters present in all the Rickettsiales distinguish the Rickettsiales from other intracellular bacteria examined (Table S2). These include a type I secretion system ATPase, a pyridine nucleotide-disulfide oxidoreductase family protein, a putative transporter, and type IV secretion system proteins VirB9 and VirB8. Thirteen ortholog clusters composed of 12 conserved hypothetical proteins and a GNAT family acetyltransferase distinguish all the Anaplasmataceae from the other Rickettsiales (Table S3). Five genera in the Rickettsiales order have at least one representative sequenced. In order to compare these five genera, the following genomes were compared: R. prowazekii, N. sennetsu, W. pipientis, A. phagocytophilum, and E. chaffeensis. This comparison shows conservation of 423 ortholog clusters (Table S4) generally associated with housekeeping functions. Most genes in the five compared genomes are either conserved among all genomes or unique to a given genome. Indeed, 60% of the two-, three-, and four-way comparisons shared fewer than ten ortholog clusters (Figure 4).
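For five genomes there are 10 + 10 + 5 = 25 possible two-, three-, and four-way combinations, so the 60% quoted above corresponds to 15 of those 25 regions. The following Python sketch shows one way such region counts could be tallied from ortholog cluster membership. It is only an illustration: the function names `region_counts` and `partial_regions` and the toy cluster table are hypothetical, and the membership data are not taken from the paper.

```python
from collections import Counter
from itertools import combinations

GENOMES = ("R. prowazekii", "N. sennetsu", "W. pipientis",
           "A. phagocytophilum", "E. chaffeensis")


def region_counts(cluster_members):
    """Count ortholog clusters by the exact set of genomes that contain them."""
    return Counter(frozenset(members) for members in cluster_members.values())


def partial_regions(genomes=GENOMES):
    """All two-, three-, and four-way genome combinations."""
    return [frozenset(c) for k in (2, 3, 4) for c in combinations(genomes, k)]


# Hypothetical cluster membership, for illustration only (not data from the paper).
toy_clusters = {
    "cluster_1": set(GENOMES),                               # shared by all five
    "cluster_2": {"R. prowazekii", "W. pipientis"},
    "cluster_3": {"A. phagocytophilum", "E. chaffeensis"},
    "cluster_4": {"A. phagocytophilum", "E. chaffeensis"},
    "cluster_5": {"N. sennetsu"},                            # genome-specific
}

counts = region_counts(toy_clusters)
regions = partial_regions()
print(len(regions))  # number of two-, three-, and four-way combinations

# Regions holding fewer than ten clusters; empty regions count as zero.
sparse = [r for r in regions if counts[r] < 10]
print(len(sparse) / len(regions))
```

Counting by the exact membership set keeps each cluster in a single region, which is how the regions of a five-way Venn diagram are defined.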
In two-way comparisons, the AC (R. prowazekii and W. pipientis) and DE (A. phagocytophilum and E. chaffeensis) intersections contain more than twenty ortholog clusters. Genes shared only by R. prowazekii and W. pipientis include those for cell wall biosynthesis, subunits of cytochrome D ubiquinol oxidase, a biotin transporter, a dinucleoside polyphosphate hydrolase, and an amino acid permease (Table S7). The presence of genes for cell wall biosynthesis in only R. prowazekii and W. pipientis likely reflects differences in the cell surface; A. phagocytophilum, E. chaffeensis, and N. sennetsu do not synthesize peptidoglycan [28]. The peptidoglycan biosynthesis genes are also found in A. marginale, which suggests that if these genes are expressed, A. marginale may have a peptidoglycan layer [23]. Since the peptidoglycan genes are present in A. marginale and W. pipientis but not in the other Anaplasmataceae, these genes have either been horizontally acquired in these organisms or have been lost numerous times in the Anaplasmataceae. Peptidoglycan binding to the Toll-like receptor 2 activates leukocytes. Neither A. marginale nor W. pipientis infects the immune cells of a vertebrate host. The peptidoglycan layer may have been lost to allow the organism to successfully infect vertebrate immune cells. Genes shared only by A. phagocytophilum and E. chaffeensis include those encoding thiamine biosynthetic proteins, a potassium transporter, a peptide deformylase, and an ankyrin repeat protein (Table S8). Thiamine biosynthesis is distinctly absent from N. sennetsu, suggesting a possible trematode niche-specific adaptation. A. phagocytophilum, E. chaffeensis, and N. sennetsu have 462, 312, and 303 open reading frames (ORFs) or paralog clusters that are unique with respect to the five-organism ortholog cluster analysis, respectively. 
The vast majority of these unique genes encode hypothetical, conserved hypothetical, and conserved domain proteins, as well as uncharacterized membrane proteins and lipoproteins. Other A. phagocytophilum-specific genes include those encoding the P44 outer membrane proteins and the HGE-14 and HGE-2 antigenic proteins (Table S9). E. chaffeensis-specific genes include those for the OMP-1 family of proteins, arginine biosynthesis, a major facilitator family transporter, and a variable-length PCR target protein (Table S10). N. sennetsu-specific genes include those for an F-type ATPase beta subunit, a cyclophilin-type peptidyl-prolyl cis-trans isomerase, a branched-chain amino acid transporter, a sensor histidine kinase, a strain-specific surface antigen, thioredoxin, and the type IV secretion system proteins VirB2 and VirB4 (Table S11). Of the organism-specific genes detected in this five-way comparison, over half were hypothetical proteins, many of which formed genomic islands of hypothetical proteins (Figure 2).

Only one ortholog cluster containing conserved hypothetical proteins is shared between the animal pathogens E. ruminantium (Erum1840, ERGA_CDS_01780) and A. marginale (AM279) and is absent from the human pathogens E. chaffeensis, A. phagocytophilum, and N. sennetsu. In addition, a homolog of these proteins is present in the Ehrlichia canis Jake publicly available shotgun sequence. Since A. phagocytophilum and E. chaffeensis are maintained in animal reservoirs, presence of this gene is not associated with animal infection. Instead, loss of this protein could be required to establish infection in humans. These conserved hypothetical proteins have some homology to the eukaryotic patatin family of phospholipases. Patatin has been characterized to have phospholipase A-like activity [29]. Except for N. sennetsu, all of the sequenced pathogenic Anaplasmataceae require an arthropod vector that feeds on blood (Table 1).
Three ortholog clusters, including one for bacterioferritin and two for conserved hypothetical proteins, are absent in all of the tick-, flea-, and louse-borne Rickettsiales, but are present in Wolbachia spp. and N. sennetsu (Table S14). The proteins in these ortholog clusters may be correlated with the lack of a blood-sucking arthropod in the life cycles of these organisms. The tick-borne Anaplasmataceae (Ehrlichia spp. and Anaplasma spp.) are the only Rickettsiales that are not transmitted transovarially in the invertebrate host. One ortholog cluster containing a class II aldolase/adducin domain protein (NSE_0849, RC0678, RP493, RT0479, WD0208) is absent only from Ehrlichia spp. and Anaplasma spp. Lack of this aldolase/adducin domain protein may prevent transovarial transmission in the arthropod vector. Four ortholog clusters of conserved hypothetical proteins are present in all the pathogenic Rickettsiales but none of the endosymbionts. These proteins, which remain to be characterized, may be essential for pathogenesis or survival in the vertebrate host (Table S15).

A. phagocytophilum Strain Comparison

As an initial effort to use these genome sequences to identify the conserved genomic content of unsequenced members of these species, we conducted microarray-based comparative genome hybridization analyses with two A. phagocytophilum strains. Except for four p44 hypervariable regions (discussed below), the genomic content across all three strains is conserved (ratio < 3). Although A. phagocytophilum and A. marginale have very different complements of unique genes, the genomic content within the strains of A. phagocytophilum is highly conserved. Conservation of the gene content of the strains may explain the similarity of clinical signs of HGA from two geographic regions (New York, Minnesota) and equine ehrlichiosis in California [7].
Free-Living and Obligate-Intracellular α-Proteobacteria

In order to understand the differences between these obligate intracellular pathogens and a closely related free-living organism, the number of genes in each role category was compared between representative Anaplasmataceae and Caulobacter crescentus (Table 3). C. crescentus is a sequenced, free-living α-proteobacterium that is closely related to the Rickettsiales [30]. The scope of this comparison was limited to only these five α-Proteobacteria, as only these organisms had role categories assigned in an identical manner.
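A role-category comparison of this kind comes down to comparing two proportions: the share of genes in a category in one genome against the share in another. The statistical test used for Table 3 is not stated in this section, so the sketch below substitutes a standard two-sided two-proportion z-test as an assumption. The function name `two_proportion_z` and all gene counts are hypothetical and serve only to show the arithmetic.

```python
import math


def two_proportion_z(k1, n1, k2, n2):
    """Two-sided z-test for a difference between proportions k1/n1 and k2/n2."""
    p1, p2 = k1 / n1, k2 / n2
    pooled = (k1 + k2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: (genes in one role category, genes with an assigned role).
small_genome = (100, 900)    # stand-in for a reduced intracellular genome
large_genome = (120, 3700)   # stand-in for a free-living relative

z, p = two_proportion_z(*small_genome, *large_genome)
share_small = small_genome[0] / small_genome[1]
share_large = large_genome[0] / large_genome[1]
print(f"{share_small:.1%} vs {share_large:.1%}, z = {z:.2f}, p = {p:.2g}")
```

In this toy case the smaller genome has fewer genes in the category in absolute terms but a larger share of its genome devoted to it, which is the kind of difference a percentage-based comparison is designed to expose.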
All of the Anaplasmataceae examined have significantly higher percentages of their genomes involved in nucleotide biosynthesis, cofactor and vitamin biosynthesis, and protein synthesis. Enzymes in these biosynthetic pathways are likely to play an important role in interactions with their hosts and intracellular survival, as discussed below. The protein synthesis category includes many essential genes such as those encoding ribosomal proteins, tRNA synthetases, RNA modification enzymes, and translation factors. These genes are essential and cannot be sacrificed as the genome reduces. Therefore, as the genome size decreases, the proportion of genes involved in protein synthesis increases. All of the Anaplasmataceae examined have a significantly lower coding capacity for central intermediary metabolism, transport, and regulatory functions. The decrease in central intermediary metabolism and transport reflects the differences in acquiring nutrients and energy. Since intracellular bacteria are exposed to a relatively restricted complement of nutrients and energy sources, they have evolved to be specialists in acquiring specific compounds from their hosts. Likewise, these intracellular bacteria live in a homeostatic environment and have fewer regulatory genes. ORFs encoding σ70 and σ32 were identified (rpoD and rpoH, respectively), but σ24 and σ54 were not detected (rpoE and rpoN, respectively). Several two-component regulatory systems are retained and may be employed as these bacteria transition between their vertebrate and invertebrate hosts. Despite being identified in Rickettsia spp. [21], stringent response (a global regulatory response) may not be expected in the Anaplasmataceae, since neither RelA nor SpoT proteins were identified. There are several role categories in which only specific organisms have significant differences from, or similarities to, C. crescentus. All the bacteria except E. 
chaffeensis have a statistically significant decrease in amino acid biosynthesis. The difference between Ehrlichia spp. and the other Anaplasmataceae is due to the presence of lysine and arginine biosynthesis pathways in Ehrlichia spp., as discussed below. A. phagocytophilum has a significant increase in the percentage of genes dedicated to the cell envelope due to expansion of the OMP-1 family in Anaplasma spp. (discussed below). W. pipientis has a significantly higher percentage of its genome involved in mobile and extrachromosomal functions due to the unique presence of phage and transposons in its genome [19]. E. chaffeensis, A. phagocytophilum, and N. sennetsu have a significant decrease in mobile elements, as they have no intact prophage, no transposable elements, and only a few phage core components (HK97-like portal, major capsid, and prohead protease) scattered throughout their genomes. Lastly, A. phagocytophilum and W. pipientis both have an increased number of disrupted reading frames. Based on comparisons of the intracellular and free-living α-Proteobacteria, the only overall theme that emerges is the conservation of housekeeping genes and the shuffling of the genomes resulting in the loss of many operon structures.

Pathogenesis

Little is known about the genetic determinants required for the Rickettsiales to invade a host and cause disease. Putative pathogenesis genes were identified, including enzymes to neutralize reactive oxygen species, outer membrane proteins, and protein secretion systems.

Oxidative stress response. Reactive oxygen species have been implicated in both host defense to infection and host cell injury [31–33]. All of the Rickettsiales contain sodB, an iron superoxide dismutase. This superoxide dismutase may have an important role in pathogenesis since sodB is cotranscribed with components of the type IV secretion system in E. chaffeensis and A. phagocytophilum [34].
Further examination of conserved genes without functional annotation (e.g., conserved hypothetical proteins, conserved domain proteins) shows two other ortholog clusters of proteins that may be involved in response to oxidative stress—a putative heme copper oxidase and a putative flavohemoglobin. In both cases, there is no significant similarity to a protein of known function, but several conserved domains were identified. From a particular combination of domains and conservation of metal/cofactor ligands, a function of response to oxidative stress can be proposed for these proteins [35]. Indeed, ECH_1079, NSE_0121, and APH_1205 each contain the 12 transmembrane segments and six conserved histidine residues consistent with members of the heme-copper oxidase family. Members of this protein family include cytochrome oxidase subunit I, FixN for nitrogen fixation, and NorB for nitric oxide reduction [36]. Each of these organisms is unlikely to be fixing nitrogen and already has a functional subunit I of cytochrome oxidase (ECH_1003, NSE_0622, and APH_1085), so these orthologs may be nitric oxide reductases. Alternatively, there may be another, as yet to be identified, role for this oxidase, which was identified in all the Rickettsiales genomes except the Wolbachia sp. wBm where it is truncated (an ORF that was not annotated but has genomic coordinates from 536343 to 536534). APH_0545, NSE_0661, and ECH_0778 encode proteins with three functional motifs similar to flavohemoglobins—a heme binding site, an FAD binding domain, and an NAD binding domain. The biological function of the Escherichia coli flavohemoglobin has not been elucidated, but it has been shown to be an efficient alkylhydroperoxide reductase [37] and a nitric oxide reductase [38]. This putative flavohemoglobin is conserved among the Anaplasmataceae, but Wolbachia spp. are missing the NAD oxidoreductase domain, and R. prowazekii is missing the heme ligands. 
Although the speculation of a role for these genes in pathogenicity is intriguing, the precise function of each of these proteins will need to be elucidated experimentally. The OMP-1/MSP2/P44 protein superfamily. The Anaplasmataceae all have a diverse complement of outer membrane proteins. Many of these outer membrane proteins (OMPs) are members of Pfam PF01617 [39] and constitute the OMP-1/MSP2/P44 family. Anaplasma, Ehrlichia, and Wolbachia have each undergone variable levels of expansion of their omp-1/msp2 gene families (Figure S2). The N. sennetsu genome has only one uncharacterized protein from this family (NSE_0875). W. pipientis wMel and the Wolbachia sp. wBm have the smallest expansion with three wsp genes scattered throughout each genome. The largest expansion of this family is in Ehrlichia spp. and Anaplasma spp. These organisms cannot be transovarially inherited in their arthropod hosts. Instead, ticks acquire Ehrlichia or Anaplasma by feeding on an infected vertebrate reservoir animal. The expansion of this family may allow persistence in the vertebrate reservoir by providing antigenic variation, thus allowing for effective tick transmission. E. chaffeensis, E. canis, and E. ruminantium have 17–22 paralogous tandemly arranged genes from this family that are flanked by a transcription regulator (tr1) and a preprotein translocase (secA) [40–42]. These genes all have signal peptides and are likely to be secreted across the cytoplasmic membrane by SecA [42]. They encode immunodominant major outer membrane proteins that are differentially expressed in ticks and experimentally infected animals [43]. A. marginale St. Maries is reported to have 56 genes that have been placed into this superfamily, including eight msp2, eight msp3, one msp4, three opag, 15 omp-1, 12 orfX, seven orfY, and two msp3 remnants [23]. These genes are scattered throughout the genome with a bias in location toward the origin of replication. 
MSP2 and MSP3 are the immunodominant proteins [44]. The msp2 and msp3 gene subsets each include one full-length expression locus and seven reserve/silent sequences that are thought to recombine into the expression locus to generate antigenic variation [23]. The A. phagocytophilum genome has three omp-1, one msp2, two msp2 homologs, one msp4, and 113 p44 loci belonging to the OMP-1/MSP2/P44 superfamily. Although both Anaplasma spp. msp2 genes are members of PF01617 and the OMP-1/MSP2/P44 superfamily, the A. marginale msp2 gene is distinct from the A. phagocytophilum msp2 gene. In addition, the previously identified omp-1N is not a member of this Pfam, but is homologous to E. chaffeensis omp-1N and the msp2 operon-associated gene 3 of A. marginale [45]. The largest expansion of this family is that of p44 genes in A. phagocytophilum. Only 36 copies of p44 are in this Pfam, but many smaller regions were identified, resulting in a total of 113 annotated p44 loci (Table S16). The p44s consist of a central hypervariable region of approximately 280 bp containing a signature of four conserved amino acid regions (C, C, WP, A) and conserved flanking sequences longer than 50 bp. Diverse p44 paralogs (p44-1 to p44-65) are expressed in mammals and ticks and confer antigenic environmental adaptation, especially during tick transmission [46–49]. The genomic loci of all 65 previously described p44 genes were determined in the present study (Figure S3). Twenty-three novel p44 genes (p44-66 to p44-88) were identified by genome sequencing, but have not yet been experimentally identified as being expressed. The p44s were annotated as full-length, silent/reserve, truncated, and fragments (Figure 5).
In addition to the full-length and silent/reserve p44 genes, 21 5′ and 3′ fragments and six truncations of p44 genes larger than 60 nucleotides have been identified in the genome. Truncations include portions of a hypervariable region; fragments did not. The p44s annotated as truncated and fragments do not contain both conserved regions flanking the hypervariable region. These p44s are not expected to recombine through the homologous recombination model deduced by previous analyses of recombined p44s [49–52]. Microarray-based comparative genomic hybridization reveals that expansion of the p44 family is a common feature in A. phagocytophilum strains. All but four of the p44 unique hypervariable sequences used as targets on the microarray are present in the human isolate A. phagocytophilum MN and the horse isolate A. phagocytophilum California MRK (Figure S3; Table 4). The p44-12 and p44-9 unique regions are either absent or divergent only in strain MN. The p44-4 and p44-1 unique regions are absent or divergent in strains MN and MRK. This confirms previous results demonstrating that the p44-1 unique region is absent/divergent in MN and MRK [52].
Other important outer membrane proteins. N. sennetsu has a single p51 gene (NSE_0242) encoding its immunodominant P51 major outer membrane protein [14]. The p51 gene is highly conserved among N. risticii, N. sennetsu, and the Stellantchasmus falcatus fluke agent, but not in N. helminthoeca, the agent causing an acute, highly fatal salmon-poisoning disease of domestic and wild canines [14]. Although a full-length, highly conserved homolog for P51 was not found in the Rickettsiales genome sequences, P51 was placed in an ortholog cluster of genes conserved among all the Rickettsiales due to short regions of similarity, particularly in a C-terminal region that may include a secretion peptide motif. Other outer membrane proteins have been reported in A. marginale, including msp5, msp1a, and msp1b. The msp5 gene (a SCO1/SenC family protein) is found in all the Rickettsiales, whereas msp1a and msp1b are unique to A. marginale. Protein secretion systems. All of the strains sequenced here contain both a Sec-dependent and Sec-independent protein export pathway for secretion of proteins across the inner membrane. The Sec-independent pathway (Tat pathway) has been implicated in the transport of phospholipases in Pseudomonas aeruginosa [55]. All of the strains sequenced here also contain two components of a putative type I secretion system, potentially for transporting toxins or proteases carrying a C-terminal secretion signal. All of the Rickettsiales have a type IVa secretion system that uses a complex of transmembrane proteins and a pilus to deliver effector macromolecules from prokaryotic to eukaryotic cells. The reference Type IVa secretion system is that of Agrobacterium tumefaciens, which contains 11 genes in the virB locus and one gene in the virD locus. Several components of the A. tumefaciens type IVa secretion system are conserved in A. phagocytophilum, E. chaffeensis, and N. sennetsu. Like R. prowazekii and W. 
pipientis, the three organisms sequenced here are lacking virB1, virB5, and virB7. All but N. sennetsu lack virB2. The virB3, virB4, and virB6 homologs are contiguous at one locus (Figure S4). Neighboring this locus in all of these organisms are three or four virB6 homologs. Contiguous at a second locus are virB8, virB9, virB10, virB11, and virD4. The type IV secretion system is one of the few sets of genes syntenic between all of the Rickettsiales sequenced, suggesting that tight coordination of expression of these genes is critical. In A. tumefaciens, translocated type IV effector proteins have the consensus sequence R-X7-R-X-R-X-R-X-Xn, where lysine can substitute for arginine with no noticeable effect [56]. In addition, effector molecules are often localized to a region of the chromosome near the type IV secretion apparatus. Examination of the regions around the type IV operons in A. phagocytophilum revealed numerous genes encoding HGE-14, which contain C-terminal sequences similar, but not identical, to this motif (Table S17), suggesting that HGE-14 may be a secreted effector molecule. Subsequent searches of the Anaplasmataceae genomes with motifs like that found in HGE-14 did not reveal other potential effector molecules.

Metabolism

The metabolic potentials of A. phagocytophilum, E. chaffeensis, and N. sennetsu were compared to that of R. prowazekii and W. pipientis [18,19]. Overall, the Anaplasmataceae have very similar metabolic pathways but are quite distinct from those of R. prowazekii (Figure 6).
Nucleotide and cofactor biosynthesis. E. chaffeensis, A. phagocytophilum, N. sennetsu, and W. pipientis have the ability to synthesize all nucleotides. This differs from R. prowazekii, which cannot make purines or pyrimidines, and therefore must rely on nucleotide translocases and interconversion of the bases to obtain the full complement of nucleotides [18]. E. chaffeensis, A. phagocytophilum, and N. sennetsu are able to synthesize most vitamins and cofactors. In contrast to the other Anaplasmataceae, W. pipientis has lost some of its ability to synthesize cofactors, and it has completely lost the biosynthetic pathways for biotin, thiamine, and NAD. In addition, it may be in the process of losing the ability to synthesize folate. R. prowazekii has also lost the ability to synthesize these cofactors as well as FAD, pantothenate, and pyridoxine-phosphate. Biotin is one of the essential cofactors synthesized only by the vertebrate-infecting Anaplasmataceae. Biotin is required for many carboxylation reactions in most organisms, but many multicellular eukaryotes cannot synthesize it. RT-PCR analysis showed that all four genes in the biotin biosynthesis pathway (BioA/B/D/F) were expressed by E. chaffeensis and A. phagocytophilum in THP-1 and HL-60 cells, respectively, at both 2 d and 3 d post infection (Figure S5). The presence of nucleotide, vitamin, and cofactor biosynthetic pathways in E. chaffeensis, A. phagocytophilum, and N. sennetsu suggests that they do not need to compete with the host cell for, and may even supply host cells with, essential vitamins and nucleotides. It has been previously proposed that Wigglesworthia glossinidia supplies its host with vitamins that are rare in the blood meal of its arthropod host (tsetse fly) [57]. Interestingly, Ehrlichia spp. and Anaplasma spp., the two tick-borne intracellular pathogens sequenced, both have a complement of pathways for cofactor and amino acid biosynthesis similar to W. glossinidia (Table 5). 
This raises the possibility that these pathogens may currently be, or historically have been, able to provide a benefit to their tick hosts by providing necessary cofactors.
Amino acid biosynthesis. The Rickettsiales have a very limited ability to synthesize amino acids and must rely on transporting them from the host (Figure 6).

Glycolysis, tricarboxylic acid cycle, pentose phosphate, and respiration. A complete pyruvate dehydrogenase, tricarboxylic acid cycle, F0F1-ATPase, and electron transport chain were found in all of the organisms. All five organisms are likely to use host-derived carboxylates and amino acids, but none of these organisms can obtain carbon or energy from fatty acids or actively carry out glycolysis. The glycolysis enzymes present are limited to those that produce glyceraldehyde-3-phosphate and dihydroxyacetone phosphate from phosphoenolpyruvate (Figure 6).

Evolution and DNA Repair

A genome-scale phylogenetic analysis using a concatenated alignment of core proteins is consistent with rRNA studies and current taxonomic assignments. This indicates that Anaplasma and Ehrlichia are sister genera that share a common ancestor with Wolbachia (Figure 1).

The branch lengths on the whole genome tree can be used to get an indication of the relative rates of evolution of these organisms. In general, the branch lengths for these intracellular organisms are longer than those of their free-living relatives. This may be due either to differences in DNA repair or to population-genetic and selection-related forces. For example, many intracellular organisms go through more stringent population bottlenecks, which in turn increase the amount of genetic drift and possibly the rate of accumulation of deleterious mutations. Analysis of the genome of W. pipientis wMel revealed that it had a longer branch length than the closely related Rickettsia; the Rickettsia have higher rates of evolution than free-living organisms [19]. Wu et al. [19] ascribed this increase to features of Wolbachia biology. 
However, there appears to be a general increase in the rate for all of the Anaplasmataceae (Figure 1).

Examination of the putative DNA-repair capabilities of the different species does not reveal any significant differences between the Anaplasmataceae and the Rickettsia spp. (Table S18). Interestingly, within the Anaplasmataceae, N. sennetsu appears to have the longest branch length and the most limited suite of DNA repair genes within the group. For example, N. sennetsu is missing various glycosylases and exonucleases that contribute to repair, including uvrABC, which is involved in nucleotide excision repair. It is possible that the faster rate of evolution in this organism is related to the absence of some of these repair pathways. The absence of uvrABC in N. sennetsu and the absence of uvrBC in the Ehrlichia spp. suggest that these species do not have nucleotide excision repair (NER). NER is used by other organisms, including bacteria, archaea, and eukaryotes, as a general repair process to remove sections of DNA with gross abnormalities. One important role of NER is in the repair of UV-induced DNA damage, and defects in NER in other species lead to great increases in UV sensitivity. It appears that Neorickettsia has compensated for this by acquiring a gene homologous to DNA photolyases, an alternative mechanism for repairing UV damage. The Neorickettsia photolyase is not particularly closely related to known photolyases from α-Proteobacteria but is instead most closely related to a photolyase from Coxiella burnetii, a γ-proteobacterium. The Ehrlichia spp., however, do not encode a photolyase homolog, and thus these species may be highly UV-sensitive.

Conclusions

The dual existence of members of Anaplasma spp. and Ehrlichia spp. as invertebrate symbionts or commensals and as effective human and animal pathogens requires flexibility, a fact reflected in the genome. 
Both organisms display an expansive inventory of paralogous genes encoding diverse functions that promote survival and success in different environments when compared to Neorickettsia spp. and Wolbachia spp., which do not require a mammalian host. This capacity is evident from the large repertoire of outer membrane proteins, and partial duplication of some of the virulence determinants (e.g., components of the type IV protein secretion system). The large number of paralogous genes encoding immunodominant outer membrane proteins in Anaplasma spp. and Ehrlichia spp. has important implications for the study of pathogenesis and in the development of vaccination strategies. Adaptability in the human host may underlie significant disease manifestations. Genomic-level characterization of the full complement of variable antigens will facilitate the future development of more specific and sensitive diagnostic targets. In light of the growing recognition of the increased global burden of ehrlichiosis, development of such diagnostic targets will impact public health. In pairwise comparisons of different species within a single genus, hundreds of genes are not shared. Often these differing genes encode immunodominant outer membrane proteins, but the vast majority are genes that are not functionally characterized in any organism. Some are likely to be involved in zoonosis or specific disease characteristics. For instance, A. phagocytophilum is the only sequenced member of the Rickettsiales that infects neutrophils. Therefore, some of the A. phagocytophilum-unique genes (e.g., genes encoding P44 and HGE-14) may be involved in neutrophil invasion. Many pathogens are obligate intracellular bacteria, but because they are difficult or impossible to culture and tools for genetic manipulation are limited, they are less well characterized than the facultative intracellular bacteria or extracellular pathogens. 
The analysis of the genome sequences provides critical insights into the biology of these intracellular pathogens and will facilitate manipulation of the emerging human ehrlichiosis agents and leukocytotropic pathogens.

Materials and Methods

Intracellular bacteria purification and DNA preparation. Organisms (infecting ~1 × 10⁹ host cells; 50–100 175-cm² flasks) were cultured in synchrony in respective host cells (E. chaffeensis in DH82 cells, A. phagocytophilum in HL-60 cells, and N. sennetsu in P388D1 cells). Bacterial cells were liberated from the infected host cells using Dounce homogenization, differential centrifugation, and Percoll density gradient centrifugation [60]. Any specimens with host nuclei contamination were excluded. From these isolated bacteria, phenol extraction was used to purify DNA that was minimally fragmented and free of host-cell DNA. Levels of host DNA contamination were verified to be less than 0.001% by PCR using host G3PDH-specific primers. This method was highly successful, with only 14 sequencing reads identified as being of human origin from a total of over 57,000 good sequencing reads.

Sequencing and annotation. The complete genome sequences were determined using the whole-genome shotgun sequencing approach [61]; sequences were assembled into contigs using the Celera Assembler [62], and all gaps were closed [63]. ORFs from each genome were predicted and annotated using a suite of automated tools that combine Glimmer gene prediction [64,65], ORF and non-ORF feature identification (e.g., protein motifs), and assignment of database matches and functional role categories to genes [63]. Frameshifts and point mutations were detected and corrected where appropriate; those remaining were annotated as “authentic frameshift” or “authentic point mutation.” Repeats were identified using RepeatFinder [66,67] and were manually curated. The complete genome sequences for A. phagocytophilum HZ, E. chaffeensis Arkansas, and N. 
sennetsu Miyayama have been deposited in GenBank.

Annotation of the p44 genes. Full-length p44s were defined as having ORFs greater than 1,000 bp with conserved start and stop codons. For shorter silent/reserve p44s, the ORFs were initially identified by locating highly conserved 5′ and 3′ sequences and signature sequences within the hypervariable region. Since these silent/reserve p44s lack a start and stop codon, the 5′ and 3′ ends were annotated on the basis of conserved genome features found in full-length p44 genes [50,68]. The annotated p44 fragments are at least 60 nucleotides in length, have either 5′ or 3′ conserved sequences, and may contain a partial hypervariable region (Figure 5).

Genome comparisons. Ortholog clusters were delineated for R. prowazekii Madrid E [18], R. typhi Wilmington [21], R. conorii Malish 7 [20], N. sennetsu Miyayama, Wolbachia sp. wBm [22], W. pipientis wMel [19], E. chaffeensis Arkansas, E. ruminantium Gardel (GenBank CR925677.1), E. ruminantium Welgevonden [24], A. marginale St. Maries [23], A. phagocytophilum HZ, Brucella suis 1330 [69], Bartonella henselae Houston-1 [70], Coxiella burnetii RSA 493 [71], Tropheryma whipplei Twist [72], Blochmannia floridanus [73], Buchnera sp. APS [74], Chlamydia pneumoniae AR39 [75], and W. glossinidia brevipalpis [76]. For Wolbachia sp. wBm, the ORFs used in these comparisons were uncurated ORFs predicted and annotated using a suite of automated tools that combine Glimmer gene prediction [64,65], ORF and non-ORF feature identification (e.g., protein motifs), and assignment of database matches and functional role categories to genes [63]. Upon release of the annotated genome [22], these uncurated ORFs were paired with the corresponding curated ones where possible, with exceptions noted in the text. 
Paralog clusters within each of the genomes were identified using the Jaccard algorithm with the following parameters: 80% or greater identity and Jaccard coefficient 0.6 or higher [77, Text S1]. Members of paralog clusters were then organized into ortholog clusters by allowing any member of a paralog cluster to contribute to the reciprocal best matches used to construct the ortholog clusters. The conservation of ortholog clusters across the various genomes analyzed was determined using Sybil, a web-based software package for comparative genomics developed at TIGR (http://sybil.sourceforge.net). The database of these clusters and corresponding tools can be accessed through TIGR (http://www.tigr.org/sybil/rcd). Metabolic pathways and transporters were compared across genomes using (1) these calculated ortholog clusters, (2) Genome Properties [78], (3) TransportDB [79], and (4) Biocyc [80]. Significant differences in the role category composition were determined using a χ² test with the Yates continuity correction. A p-value less than 0.01 was considered significant.

GC-skew and origin prediction. The GC-skew was calculated as (C − G)/(C + G) in windows of 1,000 bp along the chromosome [81]. The origin of replication was not experimentally determined in any of the genomes. For E. chaffeensis and N. sennetsu, a clear shift in GC-skew occurs near parA and parB. Therefore, basepair 1 was set in the intergenic region between the two genes. In A. phagocytophilum, a GC-skew transition occurs near polA. Therefore, basepair 1 was set in the intergenic region near polA.

Atypical nucleotide composition. Regions of atypical nucleotide composition were identified by χ² analysis: the distribution of all 64 trinucleotides was computed for the complete genome in all six reading frames, followed by the trinucleotide distribution in 5,000-bp windows overlapping by 500 bp.
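To make the GC-skew calculation described above concrete, here is a minimal Python sketch of (C − G)/(C + G) computed in fixed windows, with a helper that reports where the skew changes sign between adjacent windows. It is an illustration rather than the software used in this study: the function names are hypothetical, the windows are assumed to be non-overlapping, and a window with no C or G is assigned a skew of 0.0 by convention.

```python
def gc_skew(sequence, window=1000):
    """Return a list of (window_start, skew), with skew = (C - G) / (C + G).

    Windows are non-overlapping (an assumption; the text gives only the
    window size). A window containing neither C nor G gets a skew of 0.0.
    """
    seq = sequence.upper()
    result = []
    for start in range(0, len(seq), window):
        chunk = seq[start:start + window]
        c = chunk.count("C")
        g = chunk.count("G")
        skew = (c - g) / (c + g) if (c + g) else 0.0
        result.append((start, skew))
    return result


def sign_changes(skews):
    """Return the start coordinates of windows whose skew has the opposite
    sign to the preceding window (candidate skew transitions)."""
    changes = []
    for (_, prev), (start, cur) in zip(skews, skews[1:]):
        if prev * cur < 0:
            changes.append(start)
    return changes


if __name__ == "__main__":
    # Toy sequence with 4-bp windows, for illustration only.
    toy = "CCCG" + "CGGG" + "ACGT" + "AAAA"
    skews = gc_skew(toy, window=4)
    print(skews)
    print(sign_changes(skews))
```

On a real chromosome, single-window sign flips are noisy, so a transition would normally be accepted only where the sign change persists across many consecutive windows.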
For each window, the χ² statistic was computed based on the difference between the trinucleotide content in that window and that of the whole genome. Peaks indicate regions of atypical nucleotide composition.

Genome tree construction. Protein sequences of 31 housekeeping genes (frr, infC, nusA, pgk, pyrG, rplA, rplB, rplC, rplD, rplE, rplF, rplK, rplL, rplM, rplN, rplP, rplS, rplT, rpmA, rpoB, rpsB, rpsC, rpsE, rpsI, rpsJ, rpsK, rpsM, rpsS, smpB, and tsf) from complete α-Proteobacteria genomes were aligned to predefined HMM models and ambiguous regions were autotrimmed according to an embedded mask. Concatenated alignments were then used to build a maximum likelihood tree with bootstrapping using PHYML [82]. The γ-proteobacterium E. coli and the β-proteobacterium Neisseria meningitidis were used as outgroups to root the tree.

Array construction and hybridizations. Oligonucleotides (70-mer) were designed from the unique ORFs of each of the three genomes. The oligonucleotides (Illumina, San Diego, California, United States) were diluted to 25 μM in DMSO and spotted in quadruplicate onto UltraGap slides (Corning, Acton, Massachusetts, United States). Cy3 and Cy5 probes were synthesized from genomic DNA as previously described [83]. In order to obtain enough DNA for microarray analysis, small amounts of DNA were prepared in the manner described above for genome sequencing. This DNA was then quantitatively amplified using GenomiPhi (Amersham, Piscataway, New Jersey, United States). Appropriately labeled query and reference probes were hybridized overnight, washed, and scanned using an Axon GenePix 4000B scanner (Axon Instruments, Union City, California, United States). The corresponding images were analyzed with TIGR Spotfinder [84]. Log mode centering was used to normalize the data, alleviating the bias of expression microarray normalization methods, which expect a normal distribution of data. 
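As a rough illustration of log mode centering, the Python sketch below histograms the log2 ratios and subtracts the center of the most populated bin, so that the histogram mode sits at zero. This is not the script used in this study: the function name, the default `bin_width` of 0.1, the tie-breaking rule, and the decision to drop non-positive ratios are all assumptions made for the example.

```python
import math
from collections import Counter


def mode_center_log2(ratios, bin_width=0.1):
    """Return log2 ratios shifted so that the modal histogram bin is centered on zero.

    bin_width is a hypothetical choice. Non-positive ratios have no logarithm
    and are dropped, so the output may be shorter than the input.
    """
    logs = [math.log2(r) for r in ratios if r > 0]
    if not logs:
        return []
    # Histogram: bin index -> number of values falling in that bin.
    bins = Counter(math.floor(x / bin_width) for x in logs)
    top = max(bins.values())
    # Ties between equally populated bins go to the lowest bin (arbitrary).
    modal_bin = min(b for b, n in bins.items() if n == top)
    mode = (modal_bin + 0.5) * bin_width  # center of the modal bin
    return [x - mode for x in logs]


if __name__ == "__main__":
    # Toy ratios for illustration only.
    print(mode_center_log2([2.0, 2.0, 2.0, 4.0, 1.0], bin_width=0.5))
```

Centering on the mode rather than the mean or median makes sense for comparative genome hybridization, where a sizeable tail of absent or divergent genes would otherwise pull the center away from the bulk of shared genes.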
Briefly, a Perl script was designed to construct the histogram of the log₂ of the ratio and adjust the histogram mode to zero. The data presented are the geometric means of the normalized ratios from at least two slides with different reference Cy dyes and with oligonucleotides printed in quadruplicate.

Transcript analysis of biotin biosynthetic genes. Total RNA was extracted from E. chaffeensis-infected THP-1 cells or A. phagocytophilum-infected HL-60 cells at 2 d or 3 d postinfection using RNeasy (Qiagen, Valencia, California, United States). RNA was DNase I treated (Invitrogen, Carlsbad, California, United States) in the presence of 40 U of RNaseOUT (Invitrogen) for 15 min at room temperature, followed by inactivation at 65 °C in the presence of 2.5 mM EDTA for 10 min. For cDNA synthesis, total RNA (0.5 μg) was reverse-transcribed at 42 °C for 1 h in 50 mM Tris-HCl (pH 8.3), 75 mM KCl, 3 mM MgCl₂, 0.5 mM of each dNTP, 1 U of RNase inhibitor (Invitrogen), 1.5 μM random hexamers (Invitrogen), and 10 U of Superscript II reverse transcriptase (Invitrogen). The reaction was terminated by heat inactivation at 70 °C for 15 min. To ensure the absence of DNA contamination in the RNA preparations, the assay was duplicated without reverse transcriptase. The subsequent amplification was conducted under standard conditions for 25 cycles of 95 °C for 45 s, 54 °C for 45 s, and 72 °C for 1 min, using the PCR primer pairs listed in Table S19.

Figure S1: Linear Representation of E. chaffeensis Arkansas, A. phagocytophilum HZ, and N. 
sennetsu Miyayama Genomes
ORFs are oriented along the molecule and are color-coded by role category: violet, amino acid biosynthesis; light blue, biosynthesis of cofactors, prosthetic groups, and carriers; light green, cell envelope; red, cellular processes; brown, central intermediary metabolism; yellow, DNA metabolism; light gray, energy metabolism; black, mobile/extrachromosomal functions and truncated ORFs; magenta, fatty acid and phospholipid metabolism; pink, protein synthesis and fate; orange, purines, pyrimidines, nucleosides, and nucleotides; olive, regulatory functions and signal transduction; dark green, transcription; teal, transport and binding proteins; gray, unknown function; crosshatched, conserved hypothetical proteins; white, hypothetical proteins; salmon, other categories. The tRNA genes and rRNA genes are represented by their secondary structure. Repeats are shown by dual-arrowed line segments; paralogs are represented by arrows. (10.4 MB PDF)

Figure S2: Phylogenetic Tree of OMP1 Proteins
The protein sequences of all the members of PFAM01617 were aligned and a phylogenetic tree inferred. The divergence of the OMP1/MSP2/P44 proteins in this superfamily did not permit robust inferences about the evolution of these proteins, but allowed classification of the proteins into superfamilies as reflected in their annotation. Particular families within this superfamily are highlighted, including the P44 proteins (pink), the OMP-1s (blue), and the Wsp (yellow). (876 KB PDF)

Figure S3: Distribution of the p44 Genes in A. phagocytophilum Strains
Rims 1 and 2: predicted coding regions on the plus and minus strands, respectively, color-coded by role categories. Rim 3: atypical nucleotide composition. Rim 4: distribution of p44 silent genes (green), expression locus (cyan), full-length genes (magenta), fragments (brown), and truncations (blue). 
Rims 5 and 6: microarray-based comparative genome hybridization results for A. phagocytophilum strain HZ against strains MRK and MN. The ratios [(HZ normalized intensity)/(query normalized intensity)] were divided into three categories: ratio > 10 (red; absent); ratio 3–10 (blue; absent/divergent); and ratio < 3 (not plotted; present). (1.0 MB PDF)

Figure S4: Type IV Secretion Systems in Rickettsiales
Genes encoding the type IV secretion system components can be found at two distinct regions of the Rickettsiales genome. At the larger of these regions, virB3, virB4, and virB6 show a typical arrangement. These are followed by a series of genes in the virB6 family that have been shown to be cotranscribed in W. pipientis wMel. Each of these regions is presented with ortholog clusters (see Materials and Methods) and color coded: cyan, virB3; orange, virB4; green, virB6; and purple, a virB6 family of genes. Orthologs conserved in location are connected with gray bars. The virB3 gene is not always annotated, due to its small size, but it is present in all Rickettsiales genomes examined. (616 KB PDF)

Figure S5: Transcript Analysis of Biotin Biosynthetic Genes
DNase-treated total RNA was reverse-transcribed and subsequently PCR amplified using primers specific to each biotin biosynthesis gene. RT-PCR analysis showed that all four genes in the biotin biosynthesis pathway (BioA/B/D/F) were expressed by E. chaffeensis and A. phagocytophilum in THP-1 and HL-60 cells, respectively, at 2 d (unpublished data) and 3 d postinfection. 
(61 KB JPG)

Table S1: Ortholog Clusters Conserved across All Representative Obligate and Facultative Intracellular Pathogens and Endosymbionts Presented in a Tab-Delimited Format (640 KB DOC)
Table S2: Ortholog Clusters Present in All the Rickettsiales but Not in Any Other Intracellular Bacterium Examined (49 KB DOC)
Table S3: Ortholog Clusters Present in All Anaplasmataceae but Not the Rickettsiales or Other Intracellular Bacterium Examined (47 KB DOC)
Table S4: Ortholog Clusters Present in All of Five Representatives of the Genera in the Rickettsiales That Have at Least One Representative Sequenced. The following genomes were compared: Rickettsia prowazekii, Neorickettsia sennetsu, Wolbachia pipientis, Anaplasma phagocytophilum, and Ehrlichia chaffeensis. (424 KB DOC)
Table S5: Ortholog Clusters Present in Neorickettsia sennetsu, Anaplasma phagocytophilum, and Ehrlichia chaffeensis, but Not in Wolbachia pipientis and Rickettsia prowazekii (44 KB DOC)
Table S6: Ortholog Clusters Present in Wolbachia pipientis, Anaplasma phagocytophilum, and Ehrlichia chaffeensis, but Not in Neorickettsia sennetsu and Rickettsia prowazekii (54 KB DOC)
Table S7: Ortholog Clusters Present in Rickettsia prowazekii and Wolbachia pipientis, but Not in Neorickettsia sennetsu, Anaplasma phagocytophilum, or Ehrlichia chaffeensis (41 KB DOC)
Table S8: Ortholog Clusters Present in Anaplasma phagocytophilum and Ehrlichia chaffeensis, but Not in Rickettsia prowazekii, Wolbachia pipientis, or Neorickettsia sennetsu (46 KB DOC)
Table S9: Individual Genes (Based on Ortholog 
Cluster Analysis) Present in Anaplasma phagocytophilum, but Not in Ehrlichia chaffeensis, Rickettsia prowazekii, Wolbachia pipientis, or Neorickettsia sennetsu (394 KB DOC)
Table S10: Individual Genes (Based on Ortholog Cluster Analysis) Present in Ehrlichia chaffeensis but Not in Anaplasma phagocytophilum, Rickettsia prowazekii, Wolbachia pipientis, or Neorickettsia sennetsu (244 KB DOC)
Table S11: Individual Genes (Based on Ortholog Cluster Analysis) Present in Neorickettsia sennetsu but Not in Anaplasma phagocytophilum, Rickettsia prowazekii, Wolbachia pipientis, or Ehrlichia chaffeensis (223 KB DOC)
Table S12: Ortholog Clusters Present Only in Anaplasma phagocytophilum and Anaplasma marginale (57 KB DOC)
Table S13: Ortholog Clusters Present Only in Ehrlichia chaffeensis, Ehrlichia ruminantium Welgevonden, and Ehrlichia ruminantium Gardel (82 KB DOC)
Table S14: Ortholog Clusters That Are Absent in All of the Tick-, Flea-, and Louse-Borne Rickettsiales, but Are Present in Wolbachia spp. and Neorickettsia sennetsu (27 KB DOC)
Table S15: Ortholog Clusters Present in All the Pathogenic Rickettsiales but None of the Endosymbionts (30 KB DOC)
Table S16: Anaplasma phagocytophilum p44 Genes (246 KB DOC)
Table S17: Putative Anaplasma phagocytophilum Type IV Effector Motifs in HGE-14 Proteins (27 KB DOC)
Table S19: Sequences of Oligonucleotides Used for RT-PCR (31 KB DOC)
Text S1: PDF copy of Jaccard et al. 
reference (4578 KB PDF)

Accession Numbers

The GenBank (http://www.ncbi.nlm.nih.gov) accession number for the Ehrlichia canis Jake shotgun sequence is ZP_00210380; the complete GenBank genome sequences for A. phagocytophilum HZ, E. chaffeensis Arkansas, and N. sennetsu Miyayama are CP000235, CP000236, and CP000237, respectively. The ArrayExpress database (http://www.ebi.ac.uk/arrayexpress) accession numbers for the microarray slide type and study are A-TIGR-21 and E-TIGR-125, respectively.

Acknowledgments

We acknowledge Jessie Goodman, University of Minnesota, for providing the stock culture of A. phagocytophilum MN; Robert F. Massung at the Centers for Disease Control and Prevention, Atlanta, for providing the stock culture of A. phagocytophilum MRK; Derrick Fouts for examination of the genomes for prophage elements; Robert DeBoy for examination of the genome for transposons; Leka Papazisi for assistance with statistical analysis; Karen Nelson, Ian Paulsen, and Emmanuel Mongodin for helpful discussions; Chun-Ha Wan for assistance depositing data in ArrayExpress; Robert Munson at Children's Hospital, Ohio State University, for assistance with the grant proposal; David Dyer at University of Oklahoma for initial sequencing efforts; and our reviewers for their helpful comments and suggestions.
Footnotes

Author contributions. H. Tettelin coordinated genome sequencing. Y. Rikihisa performed general project coordination and strategy. J. C. Dunning Hotopp, M. Lin, S. V. Angiuoli, J. Crabtree, J. Sundaram, Y. Rikihisa, and H. Tettelin performed sequence analysis and comparative genomics. C. Zhang, N. Ohashi, H. Niu, N. Zhi, and M. Lin cultured bacteria, purified bacteria, and purified DNA. R. Madupu, L. M. Brinkac, Q. Lin, R. J. Dodson, M. J. Rosovitz, W. Nelson, S. C. Daugherty, A. S. Durkin, M. Gwinn, D. H. Haft, J. D. Selengut, S. A. Sullivan, L. Zhou, and O. White annotated the genomes. S. V. Angiuoli, J. Crabtree, J. Sundaram, M. Lin, T. Davidsen, N. Zafar, and O. White managed data. T. R. Utterback, S. Smith, M. Lewis, H. Khouri, F. Benahmed, H. Forberger, R. Halpin, S. Mulligan, and J. Robinson sequenced and finished the genomes. J. Eisen, R. Seshadri, Q. Ren, and M. Wu performed specific analyses. M. Lin and J. C. Dunning Hotopp performed RT-PCR and CGH experiments, respectively. J. C. Dunning Hotopp, Y. Rikihisa, M. Lin, and H. Tettelin wrote the paper.

Funding. This project was supported by National Institutes of Health grant R01 AI47885 to YR. In addition, JCDH was partially supported by National Science Foundation grant EF-0328363.

Competing interests. The authors have declared that no competing interests exist.

Citation: Dunning Hotopp JC, Lin M, Madupu R, Crabtree J, Angiuoli SV, et al. (2006) Comparative genomics of emerging human ehrlichiosis agents. PLoS Genet 2(2): e21.