Table 1. Recently completed microbial genomes (August–September 2008).
Copyright Journal compilation © 2008 Society for Applied Microbiology and Blackwell Publishing Ltd

Sorting out the mix in microbial genomics

National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA

*For correspondence. E-mail galperin/at/ncbi.nlm.nih.gov; Tel. (+1) 301 435 5910; Fax (+1) 301 435 7793.

Re-use of this article is permitted in accordance with the Creative Commons Deed, Attribution 2.5, which does not permit commercial exploitation.

Although relatively few microbial genomes were completed in the past two months (Table 1), they include representatives of two new bacterial phyla, Dictyoglomi and Nitrospirae. To highlight the first genome sequences from these poorly studied taxa, these genomes have been placed in a new section at the top of Table 1.
So far, little is known about either Dictyoglomus thermophilum or Thermodesulfovibrio yellowstonii. Both are Gram-negative thermophilic heterotrophs with an extremely low (29 mol%) G+C content of their chromosomal DNA (Saiki et al., 1985; Henry et al., 1994). Dictyoglomus thermophilum is an obligate anaerobe that grows optimally at 70–73°C. It was isolated from the Tsuetate hot spring in Japan (Saiki et al., 1985) and used to purify three extremely heat-stable amylases (Horinouchi et al., 1988). Thermodesulfovibrio yellowstonii was isolated from a thermal vent in Yellowstone Lake in Wyoming. It contains c-type cytochromes, grows optimally at 65°C using lactate, pyruvate or formate plus acetate as substrates, and can use sulfate, thiosulfate and sulfite as terminal electron acceptors (Henry et al., 1994). Analysis of these genomes should provide a window into the physiology and evolutionary relationships of these new bacterial lineages.

Completion of these two genomes is a major step towards the goal of having at least one complete genome sequence from representatives of all major prokaryotic groups. Indeed, we now have at least one completely sequenced genome for 18 bacterial phyla out of the 24 listed in the taxonomic outline in the second volume of Bergey's Manual (Garrity et al., 2004; available at http://www.bergeys.org/outlines.html). Representatives of five more bacterial phyla are at various stages of genome sequencing: Chrysiogenes arsenatis (phylum Chrysiogenetes), Denitrovibrio acetiphilus (Deferribacteres), Fibrobacter succinogenes (Fibrobacteres), Thermodesulfobacterium commune (Thermodesulfobacteria) and Thermomicrobium roseum (Thermomicrobia, which may be considered a class in the phylum Chloroflexi). Only one of those 24 phyla (Gemmatimonadetes) is not yet the subject of any publicly announced genome sequencing project.
This is obvious progress in coverage of microbial diversity compared with the status of genome sequencing just two years ago (Galperin, 2006). However, this nice and clear picture of bacterial taxonomy and the corresponding genome sequencing efforts is complicated by several factors. First of all, high-rank bacterial taxonomy is still in a state of flux: new candidate phyla are being identified and new genome sequencing projects are being planned to characterize their representatives. There are ongoing sequencing projects for Lentisphaera araneosa and Thermanaerovibrio acidaminovorans, cultured representatives, respectively, of the recently recognized phyla Lentisphaerae (Cho et al., 2004) and Synergistetes (Aminanaerobia) (Hongoh et al., 2007; Jumas-Bilak et al., 2007). In addition, genomic sequencing is being performed on candidate phyla that were initially deduced based solely on the clustering of 16S rRNA sequences. Examples include a nearly complete genome from the candidate phylum TM7, which still has no cultivated members (Marcy et al., 2007), and two recently sequenced genomes from representatives of the candidate phylum Termite group 1 (TG1), one of which, “Elusimicrobium minutum”, has in the meantime been successfully cultivated. Sometimes genomic data reveal distant similarities between two or more phyla, which results in their unification into a group (e.g. Bacteroidetes/Chlorobi, Fibrobacteres/Acidobacteria) or a superphylum, e.g. Chlamydiae/Verrucomicrobia/Planctomycetes/Lentisphaerae (Wagner and Horn, 2006; Hou et al., 2008). Besides, certain validly described bacterial groups still lack any sequence information (Yarza et al., 2008). Finally, there are several alternative classifications of bacteria that made their way into the taxonomic literature but, for a variety of reasons, failed to gain acceptance in the community (Gupta, 1998; 2000; Cavalier-Smith, 2002; 2006).
Another such example is the already mentioned (Galperin, 2008) recent transfer of Mollicutes from the phylum Firmicutes into a new phylum, Tenericutes, in the latest edition of Bergey's Manual (Ludwig et al., 2008). The phylogenetic trees that served as the rationale for that move show numerous inconsistencies and hardly justify the decision to create this new phylum. It must be noted that back in 1992, Sneath and Brenner stated ‘There is no such thing as an official classification’ (see http://www.bacterio.cict.fr/Sneath-Brenner.html). This point was recently reiterated by J.P. Euzéby, whose List of Prokaryotic names with Standing in Nomenclature (http://www.bacterio.cict.fr/) includes an up-to-date listing of commonly recognized prokaryotic phyla (http://www.bacterio.cict.fr/classifphyla.html).

For a quick look at the current state of microbial genome sequencing, the easiest tool might be the NCBI's Tax Tree (http://www.ncbi.nlm.nih.gov/genomes/MICROBES/microbial_taxtree.html), which lists both completed and ongoing genome sequencing projects. However, for those interested in the emerging microbial diversity, the best source of information is probably the ‘greengenes’ website (http://greengenes.lbl.gov), which includes six different classification schemes, from the most conservative ones (the Ribosomal Database Project) to the schemes by Pace and Hugenholtz that include up to 100 bacterial lineages (see http://greengenes.lbl.gov/Download/Taxonomic_Outlines/five_way_venn.png for a nice graphical representation of the relation between these schemes).

Among archaea, Bergey's taxonomic outline recognized two phyla, Crenarchaeota and Euryarchaeota, which are now represented by 50 complete genomes (16 and 34 respectively). In addition, there are two newly suggested but not yet officially recognized phyla, each represented by a single genome: “Candidatus Korarchaeum cryptofilum” (Korarchaeota) and Nanoarchaeum equitans (Nanoarchaeota).
Again, classifications by Pace and Hugenholtz at http://greengenes.lbl.gov include up to 40 archaeal lineages.

Among eukaryotic microorganisms, few genomes have been sequenced to any significant degree, and most of these genomes represent just a handful of lineages: Amoebozoa (Dictyostelium, Entamoeba), Apicomplexa (Cryptosporidium, Plasmodium, Toxoplasma), Ciliophora (Paramecium, Tetrahymena) and Kinetoplastida (Leishmania, Trypanosoma). There are partially sequenced representatives of choanoflagellates (Monosiga), diplomonads (Giardia), parabasalids (Trichomonas), rhodophytes (Cyanidioschyzon) and stramenopiles (Phytophthora, Thalassiosira). Several more eukaryotic lineages (Apusozoa, Haptophyceae, Heterolobosea) are going to be covered by ongoing sequencing projects, but the overall coverage of major eukaryotic groups is extremely poor and will remain that way for at least the next several years.

In summary, recent genomic projects are successfully covering the diversity of cultured bacteria and archaea. Any significant coverage of the main eukaryotic lineages or of prokaryotic lineages identified through cultivation-independent methods will remain a challenge for years to come.

In eukaryotic genomics, the biggest news was the completion of the genome sequences of two malaria parasites, Plasmodium knowlesi and Plasmodium vivax. Their back-to-back publication in Nature (Carlton et al., 2008; Pain et al., 2008) was accompanied by an excellent commentary (Winzeler, 2008) on the contribution of genomic data to the progress of malaria research in the six years that have passed since the completion of the genomes of Plasmodium falciparum (Gardner et al., 2002) and Plasmodium yoelii (Carlton et al., 2002).
Coprothermobacter proteolyticus (formerly known as Thermobacteroides proteolyticus) is a moderately thermophilic proteolytic bacterium, originally isolated from the fermentation sludge of an anaerobic digester treating cattle manure mixed with tannery waste (Ollivier et al., 1985). It can grow at temperatures of up to 75°C with an optimum at 63°C. Although C. proteolyticus was initially described as a Gram-negative bacterium and therefore suggested to belong to a deep bacterial lineage, potentially at the phylum level (Rainey and Stackebrandt, 1993), analysis of its 16S rRNA revealed that it is related to Thermoanaerobacter sp. It is currently assigned to the family Thermodesulfobiaceae (Mori et al., 2003) within the order Thermoanaerobacterales and is the first sequenced genome from that family.

Phenylobacterium zucineum is an α-proteobacterium recently isolated from a human erythroleukemia cell line (Zhang et al., 2007). All close relatives of P. zucineum are free-living environmental organisms, and its 4.4 Mb genome is much larger than that of any intracellular parasite or symbiont characterized so far. Indeed, the genome sequence (Luo et al., 2008) revealed similarities with the genome of Caulobacter crescentus. However, fragments of P. zucineum genomic DNA were found among the EST libraries from breast cancer and lymphatic cell lines, suggesting that this organism might survive in proliferative tissues.

Acidithiobacillus ferrooxidans (previously known as Thiobacillus ferrooxidans; Kelly and Wood, 2000) is an obligately acidophilic chemolithoautotrophic γ-proteobacterium and a popular model organism for studying bacterial membrane energetics at acidic pH values (see Ferguson and Ingledew, 2008 for a recent review). It gains energy by oxidizing ferrous iron and is able to grow in the pH range from 1.3 to 4.0 using CO2 as the sole source of carbon.
Acidithiobacillus ferrooxidans is a major component of microbial consortia used in bio-mining to extract copper, zinc and other metals from low-grade ores. With the recent increase in the price of gold, A. ferrooxidans-based microbial consortia are increasingly used to improve recovery of gold from arsenopyrite ores.

Despite the importance of this organism (or maybe because of it), sequencing of the A. ferrooxidans genome had a long and convoluted history. The first (incomplete or ‘gapped’) genome sequence of the type strain A. ferrooxidans ATCC 23270 was produced at Integrated Genomics in 1999. It consisted of 1353 contigs covering 2611 kb and coding for 2712 proteins; it was estimated to lack ~100 kb (Selkov et al., 2000). This sequence was used for an analysis of amino acid metabolism in A. ferrooxidans, which allowed an almost complete reconstruction of its metabolic pathways, leaving just 10 unassigned (missing) enzymes. Despite the initial intent of the authors to demonstrate that ‘gapped’ microbial genomes were almost as good as complete ones (Selkov et al., 2000), this paper actually succeeded in proving the opposite: a meaningful and unequivocal analysis is only possible with a complete genome sequence. Furthermore, only small pieces of the genome were submitted to GenBank, which prevented others from analysing this genome.

Shortly after that, sequencing of the A. ferrooxidans genome was undertaken at The Institute for Genomic Research (TIGR). The resulting incomplete genome sequence of 3081 kb was made publicly available in 2001 as RefSeq entry NC_002923 and was subsequently used for a variety of genome analyses (e.g. Valdés et al., 2003; Quatrini et al., 2007). Over the next two years, this sequence was updated more than a dozen times and was finally withdrawn at the end of 2003. Since 2006, a complete genome sequence of 2982 kb coding for 3217 predicted proteins has been available on the TIGR website but was not submitted to GenBank.
Finally, a recent joint paper by Chilean and TIGR scientists (Valdés et al., 2008) reported a detailed analysis of this genome and its availability to the public. Meanwhile, JGI scientists have released a 2885 kb genome sequence of another strain of A. ferrooxidans, which encodes 2826 proteins. This strain, A. ferrooxidans ATCC 53993, was isolated from mine water of the Alaverda copper deposit in Armenia and initially assigned the name Leptospirillum ferrooxidans (Balashova et al., 1974; Hippe, 2000). Although its relation to the type strain ATCC 23270 is not known at this time, their 16S rRNA sequences are 100% identical. Thus, after 10 years of struggling with unfinished genome sequences, the public now has access to two complete genomes of A. ferrooxidans. This should allow further analyses of the properties of this remarkable organism and stimulate its use in energy research and bio-mining.

The list of completely sequenced spirochaete genomes has grown to include the genomes of Borrelia duttonii and Borrelia recurrentis (Lescot et al., 2008). Both organisms are important human pathogens causing relapsing fevers. Borrelia duttonii is transmitted by the tick Ornithodoros moubata and is found primarily in east Africa. Borrelia recurrentis is transmitted by the human body louse Pediculus humanus and is found around the world. The sequenced strain B. duttonii Ly was isolated from a 2-year-old girl with tick-borne relapsing fever in Tanzania, whereas B. recurrentis strain A1 was isolated from an adult patient with louse-borne relapsing fever in Ethiopia.

Klebsiella pneumoniae ssp. pneumoniae is a well-known human pathogen, and the first genome of its clinical isolate MGH 78578 was sequenced more than two years ago. A very interesting paper from the JCVI scientists now reports the genome sequence of an environmental N2-fixing strain of K. pneumoniae (Fouts et al., 2008).
Such strains are commonly found as endophytes that colonize tissues of rice, maize, sugarcane, banana and various grasses and improve the growth of the host plants by supplying them with ammonia. The sequenced strain K. pneumoniae 342 was isolated from maize and later shown to colonize wheat and alfalfa sprouts. Comparative analysis of the two strains provides interesting clues to the adaptation to the endophytic lifestyle, as well as to the evolution of pathogenicity in K. pneumoniae.

The list of organisms with recently sequenced genomes also includes the marine γ-proteobacteria Alteromonas macleodii and Vibrio fischeri, the δ-proteobacteria Anaeromyxobacter sp. and Geobacter bemidjiensis, five new strains of Salmonella enterica ssp. enterica that include four new serovars (Thomson et al., 2008), Streptococcus equi ssp. zooepidemicus (also known as Streptococcus zooepidemicus), the cause of an acute nephritis epidemic in Brazil (Beres et al., 2008), and the well-studied Helicobacter pylori strain G27 (Table 1).

Acknowledgments

M.Y.G. is supported by the Intramural Research Program of the NIH, National Library of Medicine. The author's opinions do not reflect the views of NCBI, NLM or the National Institutes of Health.

References