Copyright © 2008 Podar et al.; licensee BioMed Central Ltd.

A genomic analysis of the archaeal system Ignicoccus hospitalis-Nanoarchaeum equitans

1 Biosciences Division, Oak Ridge National Laboratory, 1 Bethel Valley Rd, Oak Ridge, TN 37831, USA
2 DOE Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598, USA
3 National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 8600 Rockville Pike, Bethesda, MD 20894, USA
4 Verenium Corporation, 4955 Directors Place, San Diego CA 92121, USA
5 Division of Biological Sciences, University of California San Diego, 9500 Gilman Drive, La Jolla, CA 92037, USA
6 Lehrstuhl für Mikrobiologie und Archaeenzentrum, Universität Regensburg, Universitätstraße 31, Regensburg, D-93053, Germany
7 Genome Center, University of California Davis, One Shields Avenue, Davis, CA 95616, USA
8 Current address: College of Agricultural, Consumer, and Environmental Sciences University of Illinois at Urbana-Champaign, 1101 W Peabody Dr., Urbana, IL 61801, USA
9 Current address: Biology Department, San Diego State University, 5500 Campanile Drive San Diego, CA 92182, USA
10 Current address: Amgen Inc., One Amgen Center Drive, Thousand Oaks, CA 91320, USA

Corresponding author.
Mircea Podar: podarm/at/ornl.gov; Iain Anderson: IJAnderson/at/lbl.gov; Kira S Makarova: makarova/at/ncbi.nlm.nih.gov; James G Elkins: elkinsjg/at/ornl.gov; Natalia Ivanova: NNIvanova/at/lbl.gov; Mark A Wall: Mark.Wall/at/verenium.com; Athanasios Lykidis: ALykidis/at/lbl.gov; Kostantinos Mavromatis: KMavrommatis/at/lbl.gov; Hui Sun: hsun/at/lbl.gov; Matthew E Hudson: mhudson/at/illinois.edu; Wenqiong Chen: joanchen/at/sciences.sdsu.edu; Cosmin Deciu: Cosmin.Deciu/at/verenium.com; Don Hutchison: Don.Hutchison/at/verenium.com; Jonathan R Eads: jonathan_eads/at/hotmail.com; Abraham Anderson: abraham.anderson/at/yahoo.com; Fillipe Fernandes: fancfernandes/at/gmail.com; Ernest Szeto: eszeto/at/lbl.gov; Alla Lapidus: alapidus/at/lbl.gov; 
Nikos C Kyrpides: nckyrpides/at/lbl.gov; Milton H Saier, Jr: msaier/at/ucsd.edu; Paul M Richardson: PaulRichardson/at/progentech.com; Reinhard Rachel: reinhard.rachel/at/biologie.uni-regensburg.de; Harald Huber: harald.huber/at/biologie.uni-regensburg.de; Jonathan A Eisen: jaeisen/at/ucdavis.edu; Eugene V Koonin: koonin/at/ncbi.nlm.nih.gov; Martin Keller: kellerm/at/ornl.gov; Karl O Stetter: karl.stetter/at/biologie.uni-regensburg.de

Received September 5, 2008; Revised October 21, 2008; Accepted November 10, 2008.

This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

Background
The relationship between the hyperthermophiles Ignicoccus hospitalis and Nanoarchaeum equitans is the only known example of a specific association between two species of Archaea. Little is known about the mechanisms that enable this relationship.

Results
We sequenced the complete genome of I. hospitalis and found it to be the smallest among independent, free-living organisms. A comparative genomic reconstruction suggests that the I. hospitalis lineage has lost most of the genes associated with a heterotrophic metabolism that is characteristic of most of the Crenarchaeota. A streamlined genome is also suggested by a low frequency of paralogs and fragmentation of many operons. However, this process appears to be partially balanced by lateral gene transfer from archaeal and bacterial sources.

Conclusions
A combination of genomic and cellular features suggests highly efficient adaptation to the low energy yield of sulfur-hydrogen respiration and efficient inorganic carbon and nitrogen assimilation. Evidence of lateral gene exchange between N. equitans and I. 
hospitalis indicates that the relationship has impacted both genomes. This association is the simplest symbiotic system known to date and a unique model for studying mechanisms of interspecific relationships at the genomic and metabolic levels.

Background
The crenarchaeote Ignicoccus hospitalis is a specific host for Nanoarchaeum equitans in a relationship that is thus far unique, involving two archaeal species [1-3]. Ignicoccus species have a chemoautotrophic metabolism that couples CO2 fixation with sulfur respiration using molecular hydrogen in high temperature hydrothermal vent systems and thus might resemble organisms that thrived on the primitive, hot and anoxic Earth [4-8]. Uniquely among Archaea, Ignicoccus cells are surrounded by two membranes separated by a wide periplasmic space within which vesicles and tubular structures emerge from the cytoplasm [9]. Some of these structures reach and fuse with the outer membrane [10], which has a distinct lipid composition and contains pores of a unique type [11]. The physiological significance of these features and their potential involvement in the relationship with N. equitans are unknown. With a highly reduced genome, N. equitans has virtually no obvious metabolic or energetic capabilities and, using unknown mechanisms, must obtain metabolites and energy from I. hospitalis by attaching to its surface [3,12,13]. The similarity of the lipid compositions between the cytoplasmic membranes of I. hospitalis and N. equitans suggests specific lipid partitioning and transport mechanisms [13]. In addition, carbon labeling and cell fractionation have demonstrated the transfer of amino acids from I. hospitalis to N. equitans [3]. In co-cultures with I. hospitalis, N. equitans cells can be regularly observed detached and, for some time, they appear to maintain their membrane integrity, at least based on live-dead staining [3]. 
The mechanism of separation from the host cell and the potential existence of a reattachment process are still unknown. Attempts to propagate N. equitans in co-cultures with other archaea, including other species of Ignicoccus, have not been successful, suggesting that the relationship with I. hospitalis is highly specific and involves a recognition mechanism [3]. While under laboratory conditions the effects exerted by N. equitans on its host range from mildly to moderately inhibitory [1,3], Nanoarchaeum might confer on Ignicoccus an advantage in colonizing hydrothermal vents [14]. As its exact nature remains elusive, provisionally describing this relationship as a symbiosis is compatible with representing either a novel type of interspecific association or fitting within recognized categories of microbial interactions [15]. It has been proposed that genomic characteristics of N. equitans such as the numerous split genes and extremely compact genome might be signatures of an ancient lineage [12,16], although a viable alternative seems to be that at least some of these features are secondarily derived [17]. The age of the Ignicoccus-Nanoarchaeum relationship is unknown, although both organisms represent hyperthermophilic lineages and inhabit types of ecosystems that are often considered to be ancient [7,18]. This system provides insights into physiological mechanisms of interaction between unicellular organisms and can offer clues to evolutionary events that shape the genomes of symbionts leading to physiological interdependence. The Ignicoccus-Nanoarchaeum relationship might even serve as an analogous model to proposed symbiotic events that could have led to the formation of eukaryotic cells [19]. To advance the study of this relationship at the genomic level, we sequenced the complete genome of I. hospitalis, complementing that of N. equitans [12]. 
In this study, in conjunction with the available physiological and morphological data, we performed the genomic analysis and metabolic reconstruction of I. hospitalis, as a step to deciphering the evolutionary history and the molecular mechanisms that enable the symbiotic relationship between the two archaea.

Results and discussion

A minimal genome
The genome of I. hospitalis consists of a single circular chromosome (Table 1). At 1,297,538 bp, the genome of I. hospitalis is the smallest among free-living organisms, which do not require a continuous association with another species and can replicate independently (Figure 1).
The sizes of microbial genomes are the result of dynamic equilibria between contraction by deletions and expansion due to duplications, lateral gene transfer and insertion of mobile DNA. For free-living organisms with very large effective population sizes, genome streamlining is likely to be a selective consequence of reducing the metabolic burden to maintain DNA of little adaptive value, as illustrated by the genomes of such highly successful and widespread lineages as Prochlorococcus and Pelagibacter [20,21]. An alternative (but not necessarily exclusive) hypothesis links genome reduction to elevated mutation rates in large populations. Accumulation of mutations could lead to inactivation and loss of genes that make a weak contribution to the fitness of the respective organisms [22]. Ignicoccus, however, inhabits heterogeneous, geographically dispersed and relatively ephemeral hydrothermal marine environments. Such organisms generally have small effective populations and experience periodic bottlenecks and limited gene flow [23]. Conceivably, in a case like this, genome contraction might have to do with the very active recombination and DNA repair that organisms inhabiting extreme environments employ for maintaining genomic integrity. Frequent recombination might not only efficiently remove deleterious mutations induced by the environmental conditions but also generate diversity and increase the fixation rate of adaptive alleles [24,25]. A high frequency of illegitimate, intra-chromosome recombination could also be effective in preventing genome expansion by increasing the frequency of deletions and counteracting gene duplication. This might explain the reduced genome size in many members of the Archaea and contribute to their proposed higher adaptability to chronic energy stress [26]. 
While we expect these general principles to be valid in the Nanoarchaeum-Ignicoccus system as well, the co-evolution of these two organisms also left unique imprints on their physiology [2,3]. The most striking effect of this co-evolution, however, is the massive gene loss in N. equitans, resembling that of obligate intracellular bacterial symbionts and, as an extreme case, that of eukaryotic organelles [12]. The recently published database of archaeal clusters of orthologous genes (arCOGs) provides a framework for comparing the I. hospitalis genomic data to genes from 41 previously sequenced archaeal genomes organized into sets of probable orthologs [27]. Of the 1,434 annotated I. hospitalis protein-coding genes, 1,155 (80.5%) were assigned to arCOGs, a coverage that is the lowest among the Desulfurococcales (85% on average) and overall among thermophilic Crenarchaeota. I. hospitalis lacks orthologs of 19 genes from the Crenarchaeota core (that is, genes that are represented in all 12 available genomes of thermophilic species of Crenarchaeota included in the arCOGs) [27] (Table S1 in Additional data file 1). None of these genes include components of information processing systems, indicating that these systems are largely intact in I. hospitalis despite the small genome. The missing genes encode, primarily, diverse metabolic enzymes, some of which - for example, thymidylate kinase - catalyze essential reactions. Conceivably, these enzymes are substituted for by distant homologs that so far remain undetected or by analogs. Using the assignment of I. hospitalis genes to arCOGs, we applied weighted parsimony to perform a reconstruction of gene gain and loss events in archaea [27,28], with an emphasis on the I. hospitalis lineage. The small genome size appears to be a result of gene loss that has vastly predominated the evolution of this lineage: it was inferred that approximately 484 arCOGs were lost, compared to the inferred gain of only 56. 
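The counts quoted above can be cross-checked with a few lines of arithmetic. The following is only a sketch that reuses the numbers stated in the text (1,434 annotated genes, 1,155 assigned to arCOGs, approximately 484 arCOGs lost and 56 gained); the variable and function names are illustrative and are not part of the original analysis.

```python
# Figures quoted in the text (arCOG assignment and parsimony reconstruction).
annotated_genes = 1434   # annotated I. hospitalis protein-coding genes
genes_in_arcogs = 1155   # genes assigned to arCOGs
arcogs_lost = 484        # approximate number of arCOGs inferred as lost
arcogs_gained = 56       # approximate number of arCOGs inferred as gained


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


coverage = percent(genes_in_arcogs, annotated_genes)   # arCOG coverage
unassigned = annotated_genes - genes_in_arcogs         # genes outside arCOGs
net_change = arcogs_gained - arcogs_lost               # net arCOG change
loss_to_gain = arcogs_lost / arcogs_gained             # losses per gain

print(f"arCOG coverage: {coverage}%")
print(f"genes not assigned to arCOGs: {unassigned}")
print(f"net change in arCOGs: {net_change}")
print(f"losses per gain: {loss_to_gain:.1f}")
```

The computed coverage agrees with the 80.5% stated in the text, and the roughly nine-to-one ratio of losses to gains illustrates why gene loss is described as dominating the evolution of this lineage.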
Approximately 946 arCOGs (1,094 genes, representing 76% of the I. hospitalis gene set) appear to have been inherited from the last common ancestor of the Desulfurococcales, the order to which Ignicoccus belongs, together with Aeropyrum pernix, Hyperthermus butylicus and Staphylothermus marinus. The functional distribution of the lost genes is consistent with the fact that I. hospitalis is an obligate anaerobic autotroph. In contrast to A. pernix, numerous genes related to aerobic metabolism as well as catabolism and transport of amino acids, sugar and nucleotides were lost, along with many transcriptional regulators (Figure 2).
The frequency of duplicated genes (paralogs) is reduced in I. hospitalis compared to all other archaea except N. equitans (Figure 3).
In addition to streamlining, selection for reducing metabolic cost in I. hospitalis may have impacted its proteome composition. In hyperthermophiles, certain biases in amino acid usage have been associated with side chain physical and chemical properties that contribute to increased protein stability [34,35]. For example, a preference for lysine over arginine has been attributed to a greater flexibility of the lysine side chain, which entropically stabilizes the folded state of proteins [36]. While the overall amino acid usage in the N. equitans-I. hospitalis proteomes follows the distribution observed for other hyperthermophiles, there is a significant increase in lysine over arginine usage in I. hospitalis relative to the values that could be predicted from the GC content (Figure 4).
Lateral gene transfer
The cell-cell contact between I. hospitalis and N. equitans seems to present an opportunity for extensive lateral gene transfer (LGT). LGT is considered to play a major role in microbial genome evolution and is well-documented in symbiotic systems and in environmental microbial communities [39-42]. Recent LGT events are readily detected with various methods based on nucleotide composition or codon usage, but methods that rely on protein sequence similarity and phylogenetic trees are more informative for ancient LGT events [43]. To analyze the I. hospitalis genome for potential LGT events, we therefore combined automatic genome-wide phylogenetic reconstruction using PyPhy [44] with similarity searches and COG distribution analysis. The LGT candidates were further analyzed using hand-curated alignments and maximum likelihood phylogenetic analyses. Identifying the LGT direction requires analysis of conflicts between the topologies of the corresponding gene trees and the adopted species tree. The position of N. equitans within the Archaea is controversial and ranges from representing a distinct and basal phylum [1,12,16] to being a derived member of order Thermococcales from the Euryarchaeota [17]. Many gene trees identify the Thermococcales as an early diverging lineage, which further complicates this distinction. Ignicoccus, on the other hand, has been confidently assigned to order Desulfurococcales from the Crenarchaeota based on phylogenetic and arCOG analysis. Therefore, when attempting to infer direction of LGT, we relied on the phylogenetic placement of N. equitans and I. hospitalis genes relative to other crenarchaeal homologues, especially those from the Desulfurococcales (Aeropyrum, Hyperthermus and Staphylothermus). A small fraction of I. hospitalis genes (approximately 6%) appear to have been transferred from lineages within Euryarchaeota, while approximately 4% seem to be of bacterial origin (Figure 5).
One of the possible outcomes of LGT in symbiotic associations involves orthologous gene displacement in the recipient genome and maintenance of the gene in the donor genome as well. In the N. equitans-I. hospitalis system, we identified 13 such cases, in which the orthologs in both genomes are each other's closest homologues (Figure 5).
A similar case of lateral transfer likely involved the gene encoding leucyl aminopeptidase (LAP), Igni738-Neq412 (Figure 6b).

Genetic information processing in I. hospitalis, as inferred from the genome sequence, is typical of the Crenarchaeota. Orthologs of two family B DNA polymerases are present in the genome (Igni62, 690); one corresponds to the aphidicolin-resistant DNA polymerase I (polA), and the other to the aphidicolin-sensitive DNA polymerase II (polB) of Aeropyrum pernix [49]. No orthologs of the third family B DNA polymerase or Euryarchaeota-type heterodimeric DNA polymerase were found. Unlike other archaeal genomes, the genes coding for replication initiation/origin recognition factor (Orc1/Cdc6) are not co-localized with the predicted origin of replication [50,51], a characteristic potentially related to general operon fragmentation in I. hospitalis. Unlike other archaea, including I. hospitalis, that possess DNA primases consisting of small (catalytic) and large (structural) subunits, N. equitans seems to encode a single-subunit primase (NEQ395) in which the small subunit is fused to the carboxy-terminal domain of the large subunit [52] (EVK, unpublished observations). This may be the result of extreme genome contraction in this organism, possibly linked to its symbiotic lifestyle. Similarly, an important molecular machine absent in N. equitans but present in I. hospitalis is the RNase P complex (RNA and four separate protein subunits, rpp14, 21, 29 and 30). It has been recently shown that tRNA processing in N. equitans is RNase P-independent, most likely because genome shrinkage led to the evolution of leaderless tRNAs that was followed by the loss of all five RNase P complex genes [53].

Transport processes
The membrane composition of hyperthermophiles is specifically adapted to reduce proton and ion permeability, which increases with temperature [54]. 
Cyclic tetraether-type lipids (caldarchaeol) that are present in the cytoplasmic membrane of I. hospitalis and in the cell membrane of N. equitans are especially associated with low permeability [13]. In contrast, the absence of caldarchaeol in the outer membrane of Ignicoccus and the presence of protein pores [11] indicate potentially less restrictive exchanges with the environment through the outer membrane. I. hospitalis has only eight types of transporters, almost all predicted to be specific for inorganic ions or export of intracellular solutes (Figure 7).
While some proteins may spontaneously insert in the membrane, most transport into and across the membrane requires the function of specialized cellular systems [57]. All the components of the Sec pathway were identified in the I. hospitalis genome, including the 7S RNA gene component of the signal recognition particle (Figure 7).

Central metabolism
I. hospitalis is the first archaeon with sulfur-based autotrophy for which a complete genome sequence is reported. The metabolic reconstruction is shown in Figure 7. Pathways for the synthesis of almost all amino acids can be recognized in the I. hospitalis genome, with the exception of proline and homocysteine. Some of the enzymatic activities involved in I. hospitalis amino acid biosyntheses have been detected experimentally and labeling experiments have been used to reconstruct most of the pathways [66]. The genome also encodes the predicted enzymes of purine, pyrimidine, NAD, riboflavin/FAD, pyridoxal and CoA biosynthesis. The mevalonate pathway for the synthesis of the characteristic archaeal membrane archaeol- and caldarchaeol-type lipids appears to be complete (Figure 7).

I. hospitalis utilizes a novel and so far unique autotrophic CO2 fixation pathway, termed the dicarboxylate/4-hydroxybutyrate cycle [69]. The individual steps of the pathway have been investigated experimentally in detail and most have been confirmed biochemically [66,69,70] (Figure 7). The archaeal-type PEP carboxylase [71] catalyzes the second CO2 incorporation reaction, which results in the formation of oxaloacetate, an important precursor for amino acid biosynthetic pathways (Figure 7). Phylogenetic analysis of the two I. hospitalis gene clusters encoding oxoacid:ferredoxin oxidoreductase complexes indicates that one of them (Igni1256-1259) belongs to the pyruvate:ferredoxin oxidoreductase family and, therefore, is the likely catalyst for acetyl-CoA carboxylation. 
The other complex (Igni1075-1078) has a close affinity to a family with oxoglutarate specificity with no close homologs in Crenarchaeota (Figure S3 in Additional data file 2), suggesting acquisition by lateral transfer. The functional inference is based on phylogenetic partitioning of archaeal oxoacid:ferredoxin oxidoreductase genes into distinct clades that correspond to enzymes specific for pyruvate, valerate/isovalerate, or 2-oxoglutarate or that have mixed specificity [72-74]. In addition, alignments of the I. hospitalis alpha and beta subunit sequences revealed the presence of motifs conserved in archaeal and bacterial enzymes specific for pyruvate (Igni 1258-1259) or 2-oxoglutarate (Igni 1077-1078) [75-77] (Figure S4 in Additional data file 2). The function of the predicted 2-oxoglutarate:ferredoxin oxidoreductase (OGOR) complex in I. hospitalis remains unclear. 2-Oxoglutarate serves as an entry point in glutamate and lysine biosynthesis and is also linked to the biosynthesis of several other amino acids as shown by carbon tracing and inferred from genomic data [66] (Figure 7).

Respiration and energetic metabolism
Under laboratory conditions, the only energy yielding reaction that sustains the metabolism of I. hospitalis is the oxidation of molecular hydrogen coupled to the reduction of elemental sulfur. While this reaction is energetically weak (-6.7 kcal/mol) [38], there are indications that this type of respiration might have been used by ancient microbes of the early Archaean [5]. Details of bioenergetic reactions and the mechanisms for generating the membrane chemiosmotic potential in anaerobic hyperthermophilic archaea are still not well understood. Minimal enzymatic components that are required include a membrane hydrogenase complex, a sulfur reductase and an electron transport chain between them. In I. hospitalis, there appear to be two clusters of genes encoding subunits of the sulfur/polysulfide reductase complex. 
The first such cluster (Igni801-803) contains the catalytic reductase (SreA), a 4Fe-4S ferredoxin (SreB) and the membrane anchoring component NrfD (SreC) with eight transmembrane domains. NrfD is thought to participate in the transfer of electrons from the quinone pool into the terminal components of the Nrf pathway. Elsewhere in the genome, a gene cluster (Igni528-530) that appears to be of bacterial origin contains a different NrfD, a periplasmic FeS ferredoxin, as well as a membrane protein with four putative heme binding sites that may serve in the electron transfer chain through the membrane, possibly binding menaquinone. This gene cluster is also present in the related archaeon Hyperthermus butylicus [81], suggesting the possibility that it was transferred between the two archaeal lineages after one of them likely acquired it from a delta proteobacterium. Two types of reductase complexes might therefore assemble in I. hospitalis, archaeal and bacterial. In other sulfur reducers a periplasmic polysulfide-sulfur transferase (a member of the rhodanese family) facilitates the transfer of low concentrations of polysulfide to the reductase. I. hospitalis is the only crenarchaeote that is missing a rhodanese family gene. This could be a result of growing under relatively neutral pH, where polysulfide concentrations may be high enough. Therefore, access of polysulfide to the cytoplasmic membrane, where the reductase complex is likely located, could occur by diffusion across the large periplasmic space after passage through the outer membrane pores. Ignicoccus depends on molecular hydrogen as the sole electron donor. A single predicted operon encodes a hydrogen uptake NiFe hydrogenase, including the large and small subunits (Igni1366-1369). 
The heterodimer is exported to the periplasm through the twin-arginine translocation (TAT) system and is assembled with a 4Fe-4S ferredoxin and a membrane protein anchor containing histidine residues that might bind a b-type heme [82]. The formation of the metal-containing active site and the assembly of the hydrogenase is a complex process requiring multiple accessory proteins [83], all of which appear to be encoded in the I. hospitalis genome (Figure 7). The I. hospitalis genome also contains a four-gene putative operon with close homologues among the bacterial respiratory periplasmic nitrate reductases (Igni1377-1380). Similarity to formate dehydrogenases was also detected, so the function of the complex is not clear, as nitrate cannot serve as a sole electron acceptor in Ignicoccus [2,60]. In bacteria, depending on the composition of the complex, periplasmic nitrate reduction can either contribute to the generation of the proton gradient or serve as an electron sink, eliminating excess reducing equivalents from the cytoplasm [85]. A complete membrane A-type ATPase is predicted to be encoded in the genome of I. hospitalis, in contrast with only a subset of subunits in N. equitans [12]. While N. equitans might be unable to synthesize ATP, the presence of a predicted nucleoside diphosphate kinase (Neq307) suggests that regeneration of the NDP pool is feasible, which might reduce its host dependency by recycling (Figure 7). As an obligate anaerobe, I. hospitalis requires a mechanism to deal with the toxicity of reactive oxygen species. A superoxide reductase is present (Igni1348) and could detoxify superoxide resulting from oxygen reduction by transition metals. According to a recently proposed mechanism [86], a ferrocyanide complex bound within the superoxide reductase active site may scavenge the superoxide by one-electron redox chemistry while the superoxide reductase iron site remains reduced. 
The resulting peroxide could be transferred to soluble organic compounds, resulting in the formation of alkyl peroxides that can be reduced by peroxiredoxin. A member of this family is encoded in the genome (Igni459) and a recent proteomic analysis of I. hospitalis in laboratory cultures has shown that its product is an abundant cytosolic protein [55].

Potential molecular and structural determinants of the I. hospitalis-N. equitans interaction
Although the recognition and exchange mechanisms between I. hospitalis and N. equitans remain elusive, the available genomic and ultra-structural data suggest some possible ways of interaction between the two organisms. Since the transporters in both species are few and provide limited specificities, they are unlikely to comprise the main route of metabolite acquisition by N. equitans. Similarly, transfer of protein complexes to N. equitans from the host by secretion, especially for membrane components, would violate topological and signal sequence constraints of the translocation machinery. Potential vehicles for the transport of metabolites and proteins from I. hospitalis to N. equitans appear to be the large and variably shaped vesicles and tubes that emerge from the host's cytoplasm [9,10]. Such structures could provide transient or even constant contact between the two cytoplasms once the physical contact between the cells has been established, possibly fulfilling the metabolic and energetic requirements of N. equitans. This would also allow it to carry out limited respiration, transport and ATP synthesis and may explain how detached N. equitans cells or cells not in direct contact with the host can survive for some time. Electron microscopy studies have indicated that some of the I. hospitalis periplasmic vesicles fuse with the outer membrane, which likely results in their contents being released into the environment [9,10]. 
This release of small molecules and, perhaps, peptides might provide chemical cues to N. equitans for host recognition and attachment. Since neither of the two organisms appears to be motile, the actual mechanism by which they find each other and become attached in the turbulent hydrothermal vent environment remains enigmatic. Recent ultra-structural and physiological studies have shown that a physical connection can form between the two organisms [3,62]. Three-dimensional reconstructions point to a dynamic type of interaction, with some N. equitans cells contacting the outer membrane of I. hospitalis in places where the host periplasmic space is wide and contains cytoplasmic vesicles, while others are attached to regions with a very narrow periplasm that display fibrillar structures [62]. The steps and molecular determinants of the cell-cell recognition and interaction and the membrane and periplasm dynamics remain uncharacterized. The cytoplasmic membrane of Ignicoccus itself is highly 'corrugated', as shown in sections and three-dimensional reconstructions, thereby increasing its surface significantly; in addition, it spontaneously evaginates in the absence of N. equitans [2,9,10,62]. Therefore, the physiological role of the conglomerate of tubes and vesicles and the significance of the wide periplasmic space probably extend beyond their possible connection to N. equitans. As energy generation resides at the level of the cytoplasmic membrane, these structures could provide a substantially increased respiratory surface confined in the space surrounded by the outer membrane, analogous to the eukaryotic mitochondrial cristae. Vesicles might concurrently transport specific lipids and proteins to the outer porous membrane, which in this case would serve not only as a protective barrier but also for controlling gas and solute exchange. 
This could represent a mechanism enabling Ignicoccus species to rely exclusively on the low energetic yield of the sulfur-hydrogen respiration to sustain an elevated turnover of cellular components at high temperature. Combined with the obligate CO2 autotrophy and efficient metabolism, such adaptations might allow Ignicoccus to outcompete heterotrophs in colonizing emerging hydrothermal vent niches that are still poor in dissolved organic compounds.

Conclusion
The combinations of ecophysiological and morphological features that collectively enable the I. hospitalis-N. equitans relationship are encoded within a surprisingly simple genomic blueprint. The genome of I. hospitalis is the smallest among free-living bacteria and archaea, shows evidence of gene exchange with N. equitans and encodes streamlined biochemical functions necessary for a chemoautotrophic metabolism relying on carbon dioxide, hydrogen and sulfur. Aside from selection pressure against genome expansion in a restrictive environmental niche, the two organisms have coevolved, leading to symbiotic specificity and gene exchange. In addition, I. hospitalis appears to have acquired a significant number of genes and predicted operons from Bacteria and Euryarchaeota, some of them encoding membrane-associated complexes involved in transport and energetic metabolism. This unicellular symbiotic system might resemble relationships that gave rise to eukaryotic organelles. The availability of complete genomic data for both organisms opens the possibility to study interspecific gene regulatory networks and identify proteins that might be exchanged between interacting cells.

Materials and methods

Genome sequencing and functional annotation
I. hospitalis KIN4I cells (DSMZ strain 18386) were grown as described in [2]. DNA was isolated from frozen cells using an alkaline lysis followed by proteinase K digestion method [87]. 
Sequencing and assembly were performed at the DOE Joint Genome Institute, Walnut Creek, CA, USA using the standard microbial genome sequencing pipeline [88] based on a combination of 3-, 6- and 40-kb (fosmid) DNA libraries. The Phred/Phrap/Consed software package was used to assemble and assess quality [89]. Possible misassemblies were corrected and gaps between contigs were closed by editing in Consed, custom primer walks or PCR amplification and sequencing. The estimated error rate in the completed genome sequence of I. hospitalis is less than 1 in 50 kbp. Automated gene prediction was performed by using the output of Critica complemented with the output of Glimmer as part of the genome annotation pipeline at Oak Ridge National Laboratory (ORNL), Oak Ridge, TN, USA. The predicted coding sequences were translated and used to search the National Center for Biotechnology Information (NCBI) nonredundant database, UniProt, TIGRFam, Pfam, PRIAM, KEGG, COG, and InterPro databases. The tRNAscan-SE tool [90] was used to find tRNA genes, whereas ribosomal RNAs were found by using BLASTn against the ribosomal RNA databases. The RNA components of the protein secretion complex and the RNase P were identified by searching the genome for the corresponding Rfam profiles using INFERNAL [91]. Transporter proteins were initially identified based on similarity to transporter categories in COG and Pfam and were further analyzed using the Transporter Classification Database [92]. Additional gene prediction analysis and manual functional annotation were performed within the Integrated Microbial Genomes (IMG) platform developed by the Joint Genome Institute, Walnut Creek, CA, USA [93]. The complete genome sequence has been deposited in GenBank [GenBank:CP000816.1].

Comparative genomic analysis
Analyses of the I. hospitalis and N. equitans genomes were carried out using the IMG system [93]. 
The genes referred to throughout the text and figures correspond to the assigned open reading frame numbers in the two genomes. Putative operons were identified using the method of Overbeek et al. [94]. Structure fold prediction of membrane proteins with no detectable similarity to other database sequences was performed using Phyre [95]. To calculate the frequency of paralogs in the different archaeal genomes, blastclust analyses were performed using the translated coding sequences and varying the similarity threshold for sequence inclusion into clusters. To derive the amino acid usage statistics for archaeal genomes, the percentages of amino acids encoded within each protein were first calculated and used to determine the overall percentage for the whole proteome. The frequency of use of each amino acid was then analyzed graphically relative to the GC content of the genome, considering that the GC content can influence codon usage. Archaeal COG analyses for I. hospitalis and N. equitans were performed as described [27]. Analysis of genome sizes of bacteria and archaea was based on genomic data available in IMG (March 2008 version). A table containing the accession numbers for all the genomes as well as all the numerical parameters and classification used in the analysis is provided as Additional data file 3.

Phylogenetic analysis

To identify the potential presence of laterally transferred genes in I. hospitalis, we first used the Pyphy system [44] to automatically calculate individual phylogenetic trees for every gene in the genome. Briefly, each protein sequence was blasted against a local version of a non-redundant protein database (SWISS and TREMBL) and sequences with significant hits (E-value <10e-6) were retrieved and aligned with the query sequence using CLUSTALW. Phylogenetic trees were then constructed using PAUP* with the neighbor joining and parsimony methods with 100 bootstrap replicates.
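The amino acid usage calculation described under Comparative genomic analysis can be illustrated with a minimal sketch. It assumes that the per-protein percentages are combined into a proteome-wide value by weighting each protein by its length, which the text does not specify, and the function names are hypothetical rather than taken from the authors' pipeline. GC content is computed separately from the genome sequence so that the two quantities can be plotted against each other.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def protein_percentages(seq):
    """Return (percent of each amino acid, number of residues counted) for one protein."""
    counts = Counter(aa for aa in seq.upper() if aa in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        return dict.fromkeys(AMINO_ACIDS, 0.0), 0
    return {aa: 100.0 * counts[aa] / total for aa in AMINO_ACIDS}, total


def proteome_percentages(proteins):
    """Combine per-protein percentages into proteome-wide percentages.

    Assumption: each protein is weighted by its length, which is equivalent
    to pooling residue counts over the whole proteome.
    """
    weighted = dict.fromkeys(AMINO_ACIDS, 0.0)
    total_len = 0
    for seq in proteins:
        pct, n = protein_percentages(seq)
        for aa in AMINO_ACIDS:
            weighted[aa] += pct[aa] * n
        total_len += n
    if total_len == 0:
        return dict.fromkeys(AMINO_ACIDS, 0.0)
    return {aa: weighted[aa] / total_len for aa in AMINO_ACIDS}


def gc_content(dna):
    """Percent G+C among unambiguous bases (A, C, G, T) of a nucleotide sequence."""
    dna = dna.upper()
    gc = dna.count("G") + dna.count("C")
    acgt = gc + dna.count("A") + dna.count("T")
    return 100.0 * gc / acgt if acgt else 0.0
```

Calling `proteome_percentages` on the translated coding sequences of each genome and `gc_content` on its nucleotide sequence yields one point per genome and amino acid for the graphical comparison. An unweighted mean of the per-protein percentages would be an equally plausible reading of the text and would give slightly different values.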
Because the automatic 'phylogenetic connection' calculated by Pyphy and displayed as the phylome map of the genome was at times affected by poor bootstrap support values or unresolved trees, we visually inspected each tree and, when sufficient confidence was present, a broad phylogenetic connection to the Crenarchaeota, Euryarchaeota or Bacteria was assigned to the I. hospitalis gene. For numerous genes, although the Ignicoccus gene was clearly of archaeal type, either the phylogenetic signal was insufficient or the evolutionary history of that gene across Archaea involved numerous potential LGTs. Such genes have been generically designated as 'archaeal'. When no close homologues for I. hospitalis genes were found or the phylogenetic trees included archaeal and bacterial genes but were not sufficiently resolved, such genes were designated as 'unknown' phylogenetic type. Finally, when the closest hit and the resulting phylogenetic trees indicated an N. equitans gene as the closest homologue, those genes were designated as potential LGTs within the N. equitans-I. hospitalis system. We also used the arCOG analysis to improve the phylotyping information for some of the functional gene categories that were not resolved by phylogenetic analysis.

Genes representing potential LGTs within the N. equitans-I. hospitalis system were subjected to a more extensive phylogenetic analysis. Sequence alignments were obtained using a combination of alternative methods as implemented on the M-Coffee web server [96]. Following manual alignment curation and masking of regions with high variability that could not be confidently aligned, the best-fit amino acid substitution model for each gene was chosen using the software Modelgenerator v84 [97]. Maximum likelihood phylogenetic trees were constructed using PhyML v2.4.4 [98] using the parameters identified by Modelgenerator.
Alternative tree topologies were also explored using a combination of the software Tree Puzzle and PROML/PHYLIP, as previously described [99]. The protein sequence alignments used to generate the trees for several inferred laterally transferred genes are provided as Additional data file 4.

Abbreviations

arCOGs: archaeal cluster of orthologous groups; GDH: glutamate dehydrogenase; GS: glutamine synthase; IMG: Integrated Microbial Genomes; LAP: leucyl aminopeptidase; LGT: lateral gene transfer; OGOR: 2-oxoglutarate:ferredoxin oxidoreductase.

Authors' contributions

MP and KOS conceived and coordinated the study. HS, DH, JRE, ES, AL and PR coordinated and conducted genome sequencing, assembly and sequence data management. MP, IA, KSM, JGE, NI, MEH, MW, AL, KM, WC, AA, NK, MS and EVK performed sequence annotation, comparative genomics and functional inference analyses. MP, KSM, CD and FF performed phylogenetic and phylogenomic analyses. All authors analyzed the results and participated in writing sections of the manuscript. MP assembled and wrote the final version of the manuscript.

Additional data files

The following additional data are available with the online version of this paper. Additional data file 1 contains a table listing the inferred crenarchaeal core genes lost by I. hospitalis (Table S1), a table listing functional categories gained and lost in the I. hospitalis genome (Table S2), a table of functional gene categories (arCOGs) present in N. equitans but absent in I. hospitalis and their distribution in archaeal genomes (Table S3) and a table listing the gene family expansions in the I. hospitalis genome (Table S4).
Additional data file 2 contains a phylogenetic tree of cultivated thermophilic species of Crenarchaeota based on SSU rRNA sequences (Figure S1), phylogenetic trees of archaeal tyrosyl-tRNA synthetases and of family IV endonucleases (Figure S2), a phylogenetic tree of the alpha subunit of archaeal 2-oxoacid:ferredoxin oxidoreductases (Figure S3) and an amino acid-based sequence alignment of conserved regions of the alpha and beta subunits of pyruvate:ferredoxin oxidoreductases and OGORs (Figure S4). Additional data file 3 contains numerical and classification data associated with all the bacterial and archaeal genomes used in genome size analysis. Additional data file 4 contains the protein sequence alignments used to infer lateral gene transfer of valyl-tRNA synthetase, leucyl aminopeptidase, tyrosyl-tRNA synthetase and endonuclease IV, in PHYLIP format.

Acknowledgements

We thank Diversa/Verenium Corporation (San Diego, CA), JGI production sequencing group and the Computational Biology Group at Oak Ridge National Laboratory (Oak Ridge, TN) for sequencing and annotation support. MP, JGE and MK were supported by the US Department of Energy, Office of Science, Biological and Environmental Research programs at Oak Ridge National Laboratory (ORNL). ORNL is managed by UT-Battelle, LLC, for the US Department of Energy under contract DE-AC05-00OR22725. Support for sequencing and data analysis was provided by the Joint Genome Institute, the US Department of Energy (IA, NI, AL, KM, HS, ES, AL, NK and PR). Diversa Corporation provided support for MP, MW, WC, CD, DH, JRE, AA and FF. KSM and EVK are supported by the Intramural Research Program of the National Institutes of Health, National Library of Medicine. HH, RR and KOS were supported by grants from the Deutsche Forschungsgemeinschaft.

References