Proc Natl Acad Sci U S A. Aug 29, 2006; 103(35): 13126–13131.
Published online Aug 21, 2006. doi:  10.1073/pnas.0605709103
PMCID: PMC1551899

The cyanobacterial genome core and the origin of photosynthesis


Comparative analysis of 15 complete cyanobacterial genome sequences, including “near minimal” genomes of five strains of Prochlorococcus spp., revealed 1,054 protein families [core cyanobacterial clusters of orthologous groups of proteins (core CyOGs)] encoded in at least 14 of them. The majority of the core CyOGs are involved in central cellular functions that are shared with other bacteria; 50 core CyOGs are specific for cyanobacteria, whereas 84 are exclusively shared by cyanobacteria and plants and/or other plastid-carrying eukaryotes, such as diatoms or apicomplexans. The latter group includes 35 families of uncharacterized proteins, which could also be involved in photosynthesis. Only a few components of cyanobacterial photosynthetic machinery are represented in the genomes of the anoxygenic phototrophic bacteria Chlorobium tepidum, Rhodopseudomonas palustris, Chloroflexus aurantiacus, or Heliobacillus mobilis. These observations, coupled with recent geological data on the properties of the ancient phototrophs, suggest that photosynthesis originated in the cyanobacterial lineage under the selective pressures of UV light and depletion of electron donors. We propose that the first phototrophs were anaerobic ancestors of cyanobacteria (“procyanobacteria”) that conducted anoxygenic photosynthesis using a photosystem I-like reaction center, somewhat similar to the heterocysts of modern filamentous cyanobacteria. From procyanobacteria, photosynthesis spread to other phyla by way of lateral gene transfer.

Keywords: cyanobacteria, protein families, lateral gene transfer

Cyanobacteria are one of the earliest branching groups of organisms on this planet (1, 2). They are the only known prokaryotes to carry out oxygenic photosynthesis, and there is little doubt that they played a key role in the formation of atmospheric oxygen ≈2.3 Gyr ago (2). Despite their evolutionary, environmental, and geochemical importance, many aspects of cyanobacterial cell life remain obscure (3–5). Genome sequencing opened a new chapter in cyanobacterial research. In the last few years, complete genome sequences of several freshwater and marine cyanobacteria became available, providing ample data for systematic analysis. A comparison of the complete genomes from three different strains of Prochlorococcus spp. demonstrated a wide variety of gene complements within this genus due to massive genome reduction in some lineages (6, 7). Studies of the genes shared by cyanobacteria and other photosynthetic organisms allowed delineation of the “photosynthetic gene set” and demonstrated a significant extent of lateral gene transfer (LGT) among phototrophic bacteria (8–11). A somewhat surprising result of the latter work has been that genes for most proteins involved in photosynthesis (hereafter “photosynthetic genes”) were not in the photosynthetic gene set.

We compared proteins encoded in 15 complete cyanobacterial genomes, including five genomes of Prochlorococcus spp., to define the minimal set of genes common to all cyanobacteria and to trace the conservation of these genes among other taxa. We analyzed the phylogenetic affinities of genes in this set and identified previously unrecognized candidate photosynthetic genes. We further used this gene set to address the identity of the first phototrophs, a subject of intense discussion in recent years (8, 9, 12–33). We show that cyanobacteria and plants share numerous photosynthesis-related genes that are missing in genomes of other phototrophs. This observation suggests, in agreement with geological evidence, that (now extinct) anoxygenic ancestors of cyanobacteria are the most plausible candidates for the ancestral photoautotrophs, which apparently disseminated parts of their photosynthetic apparatus to other bacteria by way of LGT.


Common and Unique Protein Families in Cyanobacteria.

Clustering of proteins encoded in the 15 complete cyanobacterial genomes yielded 3,188 protein families [cyanobacterial clusters of orthologous groups of proteins (CyOGs)] with members encoded in at least three genomes. Of these CyOGs, 892 were encoded in each cyanobacterial genome, and 162 more were encoded in 14 of 15 genomes (Table 2, which is published as supporting information on the PNAS web site). The combined set of 1,054 CyOGs that are missing in no more than one cyanobacterial genome is hereafter referred to as the core CyOGs. Predictably, cyanobacteria with small genomes are over-represented in the core CyOGs compared with species with larger genomes. Thus, core CyOGs include 52–66% of all proteins encoded in Prochlorococcus spp. but only 25% of Anabaena sp. PCC 7120 proteins (Table 3, which is published as supporting information on the PNAS web site).
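The core-set criterion described above (a family counts as core when it is missing from no more than one genome) can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline used in this work: the function name `core_families`, the family labels, and the four-genome toy data set are assumptions made for the example. Only the counts 892 and 162 are taken from the text.

```python
from typing import Dict, Set


def core_families(presence: Dict[str, Set[str]],
                  genomes: Set[str],
                  max_missing: int = 1) -> Set[str]:
    """Return the families absent from at most `max_missing` of the genomes."""
    return {
        family
        for family, carriers in presence.items()
        if len(genomes - carriers) <= max_missing
    }


# Hypothetical toy data: four genomes and three families (not real CyOGs).
toy_genomes = {"g1", "g2", "g3", "g4"}
toy_presence = {
    "famA": {"g1", "g2", "g3", "g4"},  # present in every genome
    "famB": {"g1", "g2", "g4"},        # missing from one genome
    "famC": {"g1", "g2"},              # missing from two genomes
}
toy_core = core_families(toy_presence, toy_genomes)

# Counts reported in the text: families found in all 15 genomes
# plus families found in 14 of the 15.
reported_core_total = 892 + 162
```

With the toy data, `famA` and `famB` pass the filter and `famC` does not; setting `max_missing=0` would keep only families present in every genome, which corresponds to the 892 families of the stricter definition.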

Analysis of CyOGs that apparently had no members in one or more cyanobacterial genomes revealed 31 (mostly short) proteins that are encoded in the respective genomes but were not called by gene-finding programs, such as subunits VI (PetL) and VII (PetM) of the cytochrome b6f complex (34). We also found five full-length genes that were annotated as pseudogenes in the original genome submissions and whose products were not included in the protein database (Table 4, which is published as supporting information on the PNAS web site).

The stringent criteria used to define the core CyOGs led to the exclusion of many previously characterized cyanobacterial proteins. Because marine picocyanobacteria are unicellular, proteins that are involved in filament formation and heterocyst differentiation (3, 4) did not make it into the core set. Certain components of photosystems I (PSI) and II (PSII) are also missing from the core set. For example, the 12-kDa extrinsic subunit, PsbU, and a low-potential cytochrome c550, PsbV, which both contribute to stabilization of the oxygen-evolving complex, are missing in four Prochlorococcus genomes (35). In contrast, PSI components PsaI, PsaJ, and PsaK and PSII component PsbZ, which are missing in the thylakoid-less cyanobacterium Gloeobacter violaceus (36, 37), are found in all other cyanobacterial genomes and hence were included in the core set, as was plastocyanin, the electron donor to PSI, which is missing in Thermosynechococcus elongatus (Table 1). Owing to the poor representation of genes involved in environmental sensing and signal transduction in the genomes of marine picocyanobacteria, most likely due to their adaptation to nutrient-poor and relatively constant oceanic environments (35, 38), there are few regulatory genes in the core set. In 85 of the 162 core CyOGs that lacked representatives from a single organism, that organism was G. violaceus. Proteins from one of the Prochlorococcus strains were missing in 31 core CyOGs, the thermophile T. elongatus was missing in 22 core CyOGs, and Synechocystis sp. was missing in 20 core CyOGs (Table 3).
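The per-organism breakdown of the families that lack a single genome can be illustrated with a short tally. The sketch below is an assumption-laden example rather than the procedure used here: `tally_single_absences` and the toy genomes are hypothetical, and only the four counts in `reported` are quoted from the paragraph above.

```python
from collections import Counter
from typing import Dict, Set


def tally_single_absences(presence: Dict[str, Set[str]],
                          genomes: Set[str]) -> Counter:
    """Count, per genome, the families that are missing from that genome only."""
    tally: Counter = Counter()
    for carriers in presence.values():
        missing = genomes - carriers
        if len(missing) == 1:
            tally[next(iter(missing))] += 1
    return tally


# Hypothetical toy data (not real CyOGs).
toy_genomes = {"g1", "g2", "g3"}
toy_presence = {
    "famA": {"g1", "g2"},        # missing only from g3
    "famB": {"g2", "g3"},        # missing only from g1
    "famC": {"g1", "g2"},        # missing only from g3
    "famD": {"g1", "g2", "g3"},  # present everywhere, not counted
    "famE": {"g1"},              # missing from two genomes, not counted
}
toy_tally = tally_single_absences(toy_presence, toy_genomes)

# Figures quoted in the text for the 162 families missing from one genome.
reported = {
    "G. violaceus": 85,
    "Prochlorococcus (one strain)": 31,
    "T. elongatus": 22,
    "Synechocystis sp.": 20,
}
accounted_for = sum(reported.values())
```

The four quoted figures sum to 158, so 4 of the 162 families are not attributed to a named organism in this paragraph.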

Table 1.
Distribution of photosynthesis-related genes in genomes of phototrophic bacteria

Most core CyOGs comprise tight clusters, with various cyanobacterial proteins showing much higher similarity to each other than to any proteins from other organisms (Fig. 2, which is published as supporting information on the PNAS web site). However, certain proteins are only distantly related to other members of the CyOG and might represent examples of relatively recent LGT in the corresponding lineage. Examples include G. violaceus genes for arginyl-tRNA synthetase glr4279, chorismate synthase glr3393, and γ-glutamyl phosphate reductase gll3923, Synechocystis sp. genes for 6-pyruvoyl-tetrahydropterin synthase slr0078, α- (slr1239) and β- (slr1434) subunits of NAD/NADP transhydrogenase, and many others.

Phylogenetic Affinities of the Core Cyanobacterial Genes.

Of the 1,054 core CyOGs, 936 are shared with other bacteria. This set includes primarily housekeeping proteins that are involved in DNA replication and repair, transcription, translation, key metabolic pathways, and energy metabolism. Approximately 50 core CyOGs shared with other bacteria are formed by “conserved hypothetical” proteins whose functions are unknown and cannot be predicted from sequence similarity (39). Almost one-third of the families that are shared with other bacteria (291 CyOGs) are also encoded in plant genomes. In addition to the ribosomal proteins and other components of the chloroplast transcription and translation machinery, this list includes enzymes of heme biosynthesis, subunits of the respiratory complexes I (NADH dehydrogenase) and IV (cytochrome c oxidase), and F0F1-ATP synthase, as well as subunits of the cytochrome b6f complex (Table 2).

Eighty-four core CyOGs are shared exclusively with plants, such as Arabidopsis thaliana and Oryza sativa, the red alga Cyanidioschyzon merolae, and the diatom Thalassiosira pseudonana. Approximately half of these proteins have known functions and participate in photosynthesis as components of PSI, PSII, light-harvesting systems, or members of the high-light-inducible protein (HLIP)/early light-inducible protein (ELIP) superfamily (Table 1). Thirty-five CyOGs with the same phylogenetic profile (i.e., encoded in at least 14 cyanobacterial genomes and in at least some chloroplast-containing eukaryotes, but not in any other of the >350 prokaryotic and eukaryotic genomes) have no known function or have general function only (Table 6, which is published as supporting information on the PNAS web site). This profile suggests that the functions of these proteins are related to photosynthesis. Indeed, several recently characterized proteins with similar phylogenetic profiles turned out to participate in chlorophyll biosynthesis (40), photosynthesis (41), and light-driven NAD(P) reduction (42, 43).
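The phylogenetic profile used in this paragraph (at least 14 cyanobacterial genomes, at least one plastid-containing eukaryote, and no other genome) amounts to a simple filter over per-family genome counts. The following is a sketch under stated assumptions: the class `PhyleticProfile`, its field names, and the example counts are hypothetical, and only the threshold of 14 comes from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhyleticProfile:
    cyanobacteria: int       # cyanobacterial genomes encoding the family
    plastid_eukaryotes: int  # plastid-containing eukaryotic genomes encoding it
    other_genomes: int       # all remaining genomes encoding it


def is_cyano_plastid_only(profile: PhyleticProfile,
                          min_cyanobacteria: int = 14) -> bool:
    """True for families confined to cyanobacteria and plastid-bearing eukaryotes."""
    return (
        profile.cyanobacteria >= min_cyanobacteria
        and profile.plastid_eukaryotes >= 1
        and profile.other_genomes == 0
    )


# Hypothetical example profiles.
candidate = PhyleticProfile(cyanobacteria=15, plastid_eukaryotes=3, other_genomes=0)
housekeeping = PhyleticProfile(cyanobacteria=15, plastid_eukaryotes=4, other_genomes=200)
cyano_only = PhyleticProfile(cyanobacteria=14, plastid_eukaryotes=0, other_genomes=0)
```

Only `candidate` satisfies the filter. A family such as `cyano_only`, which has no eukaryotic members, would fall under the cyanobacteria-specific set rather than the cyanobacteria/plant set.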

Unusual Phylogenetic Profiles.

Although the great majority of core CyOGs are shared with other bacteria and/or plants, some were also found in other eukaryotes that carry vestigial plastids, such as apicoplasts in Plasmodium falciparum and other apicomplexans. The proteins of likely cyanobacterial origin that are shared by apicomplexans and other plastid-carrying eukaryotes but that are missing in other eukaryotes include enzymes of the deoxyxylulose pathway of terpenoid biosynthesis, fatty acid biosynthesis, DNA gyrase, peptide deformylase, and several others. Most of these proteins are validated targets for antimalarial drugs (44, 45). Other proteins with similar phylogenetic profiles, such as translation initiation factor IF-1 (InfA), phosphoenolpyruvate carboxylase, and several poorly characterized proteins (e.g., seed maturation protein PM23) are also likely to function in apicoplasts and might merit exploration as additional drug targets. Another interesting group includes genes that are found exclusively in phototrophs, for example, sll0608 and sll0609 (ycf49), which are encoded in all cyanobacteria and are shared with plants and representatives of Chlorobi.

Cyanobacterial Synapomorphies.

In addition to tight clustering of their core proteins (Fig. 2), cyanobacteria possess other features identifying them as members of a distinct phylogenetic lineage (clade). The apparent cyanobacterial synapomorphies (unique features shared by members of a clade) include 50 core CyOGs that do not have close homologs in other organisms (see Materials and Methods). These CyOGs are listed in Table 7, which is published as supporting information on the PNAS web site. Remarkably, the function of only one of these genes is known: SomA (Slr0042) is an outer membrane porin (46). Although functions of the other synapomorphic core CyOGs remain unknown, some of their members are expressed under stress conditions, e.g., Sll1507, Slr1160, and Slr1915 are induced by high salt (47). Conservation of these proteins in (almost) all cyanobacteria makes them attractive targets for future experimental studies (see ref. 39).
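The operational definition of a cyanobacteria-specific family given in Materials and Methods (no noncyanobacterial hits after three PSI-BLAST iterations at an inclusion threshold of E = 0.001) can be expressed as a filter over already-parsed search results. The sketch below does not run PSI-BLAST. It assumes the hits from the final iteration are available as hypothetical `(taxon, e_value)` pairs, and it treats a hit at exactly the threshold as included, which is an assumption of this example.

```python
from typing import Iterable, Set, Tuple

INCLUSION_E = 0.001  # inclusion threshold quoted in Materials and Methods


def is_cyanobacteria_specific(hits: Iterable[Tuple[str, float]],
                              cyanobacterial_taxa: Set[str],
                              threshold: float = INCLUSION_E) -> bool:
    """True when every hit at or below the threshold comes from a cyanobacterium."""
    return all(
        taxon in cyanobacterial_taxa
        for taxon, e_value in hits
        if e_value <= threshold
    )


# Hypothetical taxa and hit lists for illustration only.
cyano = {"Synechocystis", "Anabaena", "Prochlorococcus"}
hits_specific = [("Synechocystis", 1e-40), ("Anabaena", 1e-35), ("Escherichia", 0.5)]
hits_shared = [("Synechocystis", 1e-40), ("Escherichia", 1e-6)]
```

In `hits_specific`, the only noncyanobacterial hit has an E-value above the threshold and is ignored, so the family counts as specific. In `hits_shared`, the noncyanobacterial hit passes the threshold and disqualifies the family.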

Photosynthetic Genes in the Conserved Cyanobacterial/Plant Core.

The vast majority of cyanobacterial photosynthetic genes had no detectable homologs in anoxygenic phototrophic bacteria (Table 1). Genomes of two such phototrophs, the purple α-proteobacterium Rhodopseudomonas palustris and the green sulfur bacterium Chlorobium tepidum, have been sequenced, allowing us to establish orthologous relationships between their genes and those of cyanobacteria (48, 49). In addition, the genome of the green nonsulfur bacterium Chloroflexus aurantiacus was released by the Department of Energy Joint Genome Institute in an unfinished form, and the genome of the Gram-positive phototrophic bacterium Heliobacillus mobilis has been sequenced by Integrated Genomics, Inc. (Chicago, IL) and was kindly made available to us for some BLAST searches. Although these phototrophic bacteria and cyanobacteria share numerous typically bacterial proteins (8, 9, 11), we found that anoxygenic photosynthetic bacteria possess very few photosynthetic genes shared by cyanobacteria and plants; furthermore, even these shared genes differ in different bacteria (Table 1). Of the seven groups of cyanobacterial genes that are directly related to photosynthesis, only some genes of (bacterio)chlorophyll biosynthesis are shared by all prokaryotic phototrophs.

For example, whereas all cyanobacteria encode the full set of enzymes of the Calvin–Benson–Bassham cycle, the Chlorobium and Heliobacillus genomes lack the genes for phosphoribulokinase and ribose-5-phosphate isomerase, as well as the gene encoding the small subunit of ribulose-1,5-bisphosphate carboxylase/oxygenase (RubisCO) (Table 1). These organisms encode proteins similar to the large subunit of RubisCO; however, these are likely to participate in methionine salvage, rather than in CO2 fixation (50, 51). Autotrophic CO2 fixation by Chloroflexus is known to occur by way of the 3-hydroxypropionate cycle (52). This finding leaves R. palustris as the only anoxygenic phototroph in Table 1 to use the Calvin cycle.

Analysis of the unfinished genomes of two other phototrophs, β-proteobacterium Rubrivivax gelatinosus and γ-proteobacterium Thermochromatium tepidum, revealed a pattern of presence and absence of key photosynthetic genes that was very similar to that of R. palustris (data not shown) and is likely to be common to all purple phototrophic bacteria.


The “Core Set” Versus the “Genomic Signature.”

Owing to the high level of sequence conservation among orthologous proteins from different cyanobacteria (Fig. 2), delineation of cyanobacterial gene clusters is a relatively straightforward task. In several earlier studies, this delineation was accomplished for such purposes as improved genome annotation (53), delineation of the cyanobacterial genomic signature (54), calculation of the number of cyanobacterial genes in plants (10), and tracing the evolution of the oxygen-evolving center of PSII (29). However, all these studies relied on arbitrarily set, usually conservative, threshold similarity values to infer orthology. As described previously, the clusters of orthologous groups (COG) approach, which does not depend on such thresholds, is more flexible and allows delineation of protein families with low, as well as high, levels of similarity (49). This procedure, however, can be used reliably only for complete genomes, which is why unfinished cyanobacterial genomes were not included in this work. Besides, certain CyOGs could contain homologous genes whose functions have diverged during evolution. For instance, homologous genes for phycoerythrin (cpeB) and phycocyanin (cpcB) β-chains included in CyOG00868 are most likely in-paralogs (55) rather than true orthologs.

A comparison of eight genomes, two finished and six unfinished, has been used to delineate a genomic signature of 181 cyanobacteria-specific proteins (54). A comparison of the core CyOGs with this genome signature showed that 131 of the 181 signature protein families survived inclusion of genomes of three more strains of Prochlorococcus spp. and four more strains of Synechococcus spp. and were represented in the core CyOGs (Table 3). In contrast, our analysis identified 26 synapomorphic CyOGs that were not included in the cyanobacterial genome signature (54). The 50 protein families that did not make it into the core CyOGs were most often missing in G. violaceus and in one or two strains of Prochlorococcus. In addition, for at least 19 of the remaining 131 core CyOGs, close homologs have been found among recently sequenced bacterial or archaeal genomes. This finding is hardly surprising given the rapid growth of the protein database and the large scale of lateral gene transfer among various lineages. There is little doubt that the current list of 50 cyanobacteria-specific core CyOGs (Table 7) will soon shrink even further.

Evolution of Photosynthesis and Lateral Gene Transfer.

The availability of the cyanobacterial genome core allowed us to reassess the origin of (bacterio)chlorophyll-based photosynthesis, which, in addition to cyanobacteria, is found in the Bacteroidetes/Chlorobi group (e.g., C. tepidum), Firmicutes (e.g., H. mobilis), α-Proteobacteria (e.g., R. palustris), β-Proteobacteria (e.g., Rubrivivax gelatinosus), γ-Proteobacteria (e.g., Chromatium vinosum), and Chloroflexi (e.g., C. aurantiacus). The first two phyla have photosynthetic reaction centers (RCs) that are similar to the cyanobacterial PSI and that use low-potential FeS clusters as electron acceptors (RC1 type). The RCs of proteobacteria and Chloroflexi (RC2 type) use bound quinones as ultimate electron acceptors and are similar to the cyanobacterial PSII (although lacking the oxygen-evolving complex). It is generally believed that the evolution of photosynthetic genes was accompanied by their dissemination by way of LGT between different groups of bacteria (12, 26, 30, 31, 56). This idea is supported by the apparent presence of nonphotosynthetic representatives in all of these phyla, except for Cyanobacteria; by the fact that the photosynthesis-related proteins are often encoded on a single contiguous chromosomal region (superoperon) (20, 57); by several phylogenetic analyses (8, 11, 58); and by the observation that photosynthetic genes can be transduced by cyanophages (59).

The propensity of photosynthetic genes to be laterally transferred between distantly related organisms makes identification of the lineage that was the first to develop chlorophyll-based photosynthesis particularly challenging. The extremely small number of photosynthetic proteins that are shared between different photosynthetic bacteria (Table 1) forced the phylogenetic analyses to rely on surrogate protein sets, such as the enzymes involved in (bacterio)chlorophyll biosynthesis (14). These analyses contributed to the understanding of the evolution of (bacterio)chlorophyll (17) but did not purport to reflect evolution of photosynthesis in general. In contrast, the recent study of Xiong et al. (21) assumed that topology of the phylogenetic tree built for the (bacterio)chlorophyll biosynthesis enzymes is representative of the evolution of the photosynthetic machinery as a whole. Specifically, the authors’ observations that proteobacteria branched first in the tree were interpreted as evidence that photosynthesis originally evolved in purple bacteria (21). Others, however, were either unable to reproduce this result (31) or have not observed this topology in phylogenetic trees for any other genes (11). In addition, Green and Gantt (22) noted that (i) branching of proteobacterial genes at the root of the tree meant that they were the most divergent in the set, not necessarily that they branched off earlier, and (ii) “ancient” genes in modern proteobacteria could have originated elsewhere, and purple bacteria could have acquired them by way of LGT. Thus, even if the observations of Xiong et al. (21) were valid, their conclusions on the origin of photosynthesis among purple bacteria do not follow from their results.

The data in Table 1, which show that only some enzymes of (bacterio)chlorophyll biosynthesis are found in all phototrophs, together with the observations of extensive LGT and recombination in cyanobacterial genomes, limit the contribution of the standard tree-based approach to the problem of the origin of photosynthesis. Analysis of phylogenetic patterns of key photosynthetic proteins might be more informative for this purpose.

Which Bacteria Were the First Phototrophs?

In the past few years, photosynthesis has been proposed to have emerged in Heliobacillus (16, 27), Chlorobium (13), Chloroflexus (15), or proteobacterial (21) lineages (reviewed in refs. 24, 26, 30, and 32). Although the arguments in favor of proteobacteria do not appear valid (see above), there seems to be some support for each of the other candidates. Thus, apparently primitive homodimeric RCs of type I are found in Chlorobium (13) and Heliobacillus (60), whereas Chloroflexus is believed to be an early-branching lineage of phototrophs (15). Cyanobacteria are usually not considered explicitly as a lineage in which photosynthesis could have emerged because of the far greater complexity of their photosynthetic machinery. This fact, however, can be interpreted both ways. Indeed, the total number of genes involved in photosynthesis in cyanobacteria is much greater than that in any of the other prokaryotic phototrophs (Table 1). Only cyanobacteria possess photosynthetic reaction centers of both types, RC1 and RC2, and, in addition to chlorophyll- and phycobilin-containing light-harvesting systems, have chlorophyll-binding proteins whose function is believed to be dissipation of light energy to prevent photodamage (HLIPs; see Table 1). Thus, the majority of photosynthetic genes must have first appeared in the cyanobacterial lineage anyway (Fig. 1). This finding suggests that the same could be true for the core RC genes and that the ancestors of cyanobacteria (“procyanobacteria” or “pro-protocyanobacteria” in ref. 26) should also be considered as candidates for the role of the first phototrophs.

Fig. 1.
Distribution of photosynthetic genes in different lineages of phototrophs and the directions of proposed lateral gene transfer. The phototrophic phyla are depicted in accordance with the depth of their location in modern (and perhaps primordial) microbial ...

Sequence data alone do not allow one to establish the direction of ancient lateral transfer of photosynthetic genes; this requires additional information from independent sources. Important clues to the nature of the first phototrophs can be gained from geological data. Tice and Lowe (61, 62) have provided geological evidence that the Buck Reef Chert, a 250- to 400-m thick rock running along the South African coast, was produced by phototrophic microbial communities ca. 3.4 Gyr ago. They also noted the absence of traces of life in the deeper (>200 m) water environments (61). Tice and Lowe (61, 62) defined the inhabitants of the primordial microbial communities as partially filamentous phototrophs, which, according to the carbon isotopic composition, used the Calvin cycle to fix CO2. The absence of oxidized iron and sulfur in the sediments indicated that neither iron (II) nor sulfide had been used as electron donors (62). By exclusion, this finding leaves atmospheric hydrogen as the most plausible electron donor (62, 63). Because the Calvin cycle is absent in Gram-positive (Heliobacillus) and green sulfur (Chlorobium) phototrophs that use other pathways for CO2 fixation (56), it is unlikely that their ancestors were the phototrophic inhabitants of the Buck Reef Chert. C. aurantiacus does not have the Calvin cycle either, but this pathway has been reported in another representative of green nonsulfur bacteria, Oscillochloris trichoides (64). However, RC2, which is found in Chloroflexi and purple bacteria, would hardly be useful in a hydrogen-driven metabolism. As noted by Olson (24), RC2 uses a quinone as the last electron acceptor, so it would therefore be over-reduced and kinetically incompetent under these conditions.

These observations leave the ancestors of cyanobacteria as the only phototrophs capable of inhabiting the Buck Reef Chert ca. 3.4 Gyr ago. Indeed, the mechanism of CO2 fixation, the morphology of these organisms, and their location in the upper layer of the ancient microbial mat all unite them with modern cyanobacteria. Thus, analysis of the gene content (Table 1) and the geologic evidence both suggest that photosynthesis evolved in the cyanobacterial lineage. Because there is sufficient evidence that anoxygenic photosynthesis preceded oxygenic photosynthesis and was already taking place in the period between 3.5 and 2.5 Gyr ago, we propose that the first phototrophs were procyanobacteria (anoxygenic ancestors of the extant cyanobacteria) that could be responsible for the presence of cyanobacteria-specific biomarkers (2-methylhopanoids) in the 2.7-Gyr-old sediments (65). These anoxygenic procyanobacteria might have relied on RC1 to reduce NAD(P)+ to NAD(P)H and resembled heterocysts, the specialized nitrogen-fixing cells that some modern filamentous cyanobacteria produce in response to starvation for fixed nitrogen. Heterocysts have PSI but no PSII and therefore do not conduct oxygenic photosynthesis (3, 4, 66). Instead, they maintain the anaerobic environment required for nitrogenase activity. Although modern heterocysts are a relatively recent invention (67), their formation can be viewed as a recapitulation of the ancestral cyanobacterial state and confirms the viability of cyanobacteria in a PSI-alone mode in the presence of suitable electron donors.

Driving Forces in the Origin and Evolution of Photosynthesis.

The complexity of the photosynthetic machinery leaves no doubt that its origin and subsequent evolution must have occurred in multiple steps under constant selective pressure. This selective pressure could come from at least two key factors: the necessity for the cells to gain energy and to reduce the damaging effects of solar UV, which was orders-of-magnitude stronger in the absence of the ozone shield than it is now (68). As proposed by several authors, RC1 could evolve by way of multiple duplication events from simpler chlorophyll-binding membrane proteins, similar to the HLIPs of modern cyanobacteria (18, 19, 25). As argued previously, such proteins might serve to protect DNA from the damaging effects of UV light (18, 19). The emergence of RC1 could have been driven by the need for an alternative source of reducing power as the atmospheric hydrogen content gradually decreased. NAD(P)H, which is recycled by RC1, has a redox potential similar to that of hydrogen and can replace hydrogen in certain metabolic chains; the membrane hydrogenase and NADH-dehydrogenase are related enzymes that differ only in the substrate-binding module (69).

Upon gradual oxidation of the atmosphere, the need for further sources of redox equivalents could have driven the formation of a small, high-potential RC2, possibly through fission/reshuffling of RC1 (16, 18, 19, 28). Further depletion of electron donors upon oxidation of the available Fe(II), as discussed by several authors (e.g., refs. 24, 28, and 29), could have driven the evolution of RC2 into the water-oxidizing PSII.

In this framework, modern cyanobacteria inherited their photosynthetic apparatus from ancestral phototrophs, whereas other phototrophic bacterial lineages obtained theirs by way of LGT (Fig. 1). These transfer events must have happened at different stages of evolution: The ancestors of Chlorobium and Heliobacterium must have acquired their RC1 soon after its emergence, when it was still homodimeric, whereas Proteobacteria and Chloroflexus acquired RC2 before it “learned” to oxidize water. Anoxygenic phototrophs usually dwell in the depth of microbial mats. Perhaps, therefore, they were subject to a weaker selective pressure from light and oxygen than those (ancestors of modern cyanobacteria) that remained on the surface, resulting in preservation of ancestral features of the photosynthetic apparatus. Thus, photosynthetic enzymes of anaerobic bacteria can be considered snapshots of the ancient RCs: The homodimeric RC1s of Heliobacillus mobilis (PshA) and Chlorobium tepidum (PscA) are probably more similar to the ancient homodimeric RC1 than is the highly evolved heterodimeric PSI (PsaA/PsaB) of modern cyanobacteria.

Materials and Methods

Protein sets for Anabaena (Nostoc) sp. PCC 7120, Synechocystis sp. PCC 6803, T. elongatus BP-1, and Prochlorococcus marinus SS120 were extracted from GenBank (www.ncbi.nlm.nih.gov) and clustered by using the clusters of orthologous groups (COG) method (48, 49). Proteins from an additional 11 cyanobacterial genomes (Table 2) were assigned to the resulting protein clusters (CyOGs) by using a modification of the COGNITOR procedure (49), followed by manual verification and analysis of multidomain proteins. CyOGs that were missing representatives of one, two, or three species, as well as CyOGs that contained proteins shorter than 100 amino acid residues, were compared with the translation of the corresponding genomic DNA sequences by using TBLASTN (70). Detection of homologs of cyanobacterial proteins in organisms from other taxa was performed by using BLASTP searches against the National Center for Biotechnology Information (NCBI) nonredundant protein database. Phylogenetic distributions of homologs for each CyOG were analyzed by comparing them to prokaryotic and eukaryotic protein families and by checking for bidirectional best hits and domain architecture, as described in refs. 48 and 49. Cyanobacteria-specific CyOGs were defined as those consisting of proteins that did not retrieve noncyanobacterial hits after three iterations of PSI-BLAST run with the default inclusion parameter of E = 0.001.
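The bidirectional best hit check mentioned above can be sketched as follows. This is a simplified pairwise illustration and not the COG or COGNITOR procedure itself, which is described in refs. 48 and 49. The function names, the `(query, subject, score)` tuple layout, and the protein identifiers are assumptions made for the example, and the hit lists stand in for parsed BLASTP output.

```python
from typing import Dict, Iterable, Set, Tuple

Hit = Tuple[str, str, float]  # (query protein, subject protein, bit score)


def best_hits(hits: Iterable[Hit]) -> Dict[str, str]:
    """Map each query to its highest-scoring subject (first one wins on ties)."""
    best: Dict[str, Tuple[str, float]] = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def bidirectional_best_hits(a_vs_b: Iterable[Hit],
                            b_vs_a: Iterable[Hit]) -> Set[Tuple[str, str]]:
    """Return pairs (a, b) in which a and b are each other's best hit."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return {(a, b) for a, b in forward.items() if reverse.get(b) == a}


# Hypothetical hits between genome A and genome B.
a_vs_b = [("a1", "b1", 200.0), ("a1", "b2", 150.0), ("a2", "b2", 90.0)]
b_vs_a = [("b1", "a1", 195.0), ("b2", "a1", 140.0)]
pairs = bidirectional_best_hits(a_vs_b, b_vs_a)
```

Here `a1` and `b1` are reciprocal best hits. The protein `a2` has `b2` as its best hit, but `b2` prefers `a1`, so that pair is not reported.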


We thank Beverly Green and Brian Palenik for helpful comments and Integrated Genomics, Inc., for allowing us access to the genomic sequence data for H. mobilis and T. tepidum. This study was supported by the Intramural Research Program of the National Institutes of Health, the National Library of Medicine (E.V.K., K.S.M., S.L.M., A.S., Y.I.W., and M.Y.G.), the European Community Program SynChips, the Network of Excellence Marine Genomics Europe, the Region Bretagne Program IMPALA (A.D. and F.P.), the Deutsche Forschungsgemeinschaft (A.Y.M.), and the Volkswagen Foundation (A.Y.M.).


Abbreviations: CyOGs, cyanobacterial clusters of orthologous groups of proteins; LGT, lateral gene transfer; PSI, photosystem I; PSII, photosystem II; RC, reaction center.


Conflict of interest statement: No conflicts declared.

Data deposition: The sequences reported in this paper have been deposited in the GenBank database (accession nos. DQ831217–DQ831236).


1. Altermann W., Kazmierczak J. Res. Microbiol. 2003;154:611–617.
2. Bekker A., Holland H. D., Wang P. L., Rumble D., III, Stein H. J., Hannah J. L., Coetzee L. L., Beukes N. J. Nature. 2004;427:117–120.
3. Haselkorn R. Science. 1998;282:891–892.
4. Meeks J. C., Elhai J. Microbiol. Mol. Biol. Rev. 2002;66:94–121.
5. Bhaya D. Mol. Microbiol. 2004;53:745–754.
6. Hess W. R. Curr. Opin. Biotechnol. 2004;15:191–198.
7. Dufresne A., Garczarek L., Partensky F. Genome Biol. 2005;6:R14.
8. Raymond J., Zhaxybayeva O., Gogarten J. P., Gerdes S. Y., Blankenship R. E. Science. 2002;298:1616–1620.
9. Raymond J., Zhaxybayeva O., Gogarten J. P., Blankenship R. E. Philos. Trans. R. Soc. London B. 2003;358:223–230.
10. Sato N. Genome Inform. 2002;13:173–182.
11. Zhaxybayeva O., Hamel L., Raymond J., Gogarten J. P. Genome Biol. 2004;5:R20.
12. Blankenship R. E. Photosynth. Res. 1992;33:91–111.
13. Buttner M., Xie D. L., Nelson H., Pinther W., Hauska G., Nelson N. Proc. Natl. Acad. Sci. USA. 1992;89:8135–8139.
14. Burke D. H., Hearst J. E., Sidow A. Proc. Natl. Acad. Sci. USA. 1993;90:7134–7138.
15. Pierson B. K. In: Early Life on Earth: Nobel Symposium No. 84. Bengtson S., editor. New York: Columbia Univ. Press; 1994. pp. 161–180.
16. Vermaas W. F. Photosynth. Res. 1994;41:285–294.
17. Lockhart P. J., Larkum A. W., Steel M., Waddell P. J., Penny D. Proc. Natl. Acad. Sci. USA. 1996;93:1930–1934.
18. Mulkidjanian A. Y., Junge W. Photosynth. Res. 1997;51:27–42.
19. Mulkidjanian A. Y., Junge W. In: The Phototrophic Prokaryotes. Peschek G. A., Löffelhardt W., Schmetterer G., editors. Vol. 51. New York: Kluwer Academic/Plenum; 1999. pp. 805–812.
20. Xiong J., Inoue K., Bauer C. E. Proc. Natl. Acad. Sci. USA. 1998;95:14851–14856.
21. Xiong J., Fischer W. M., Inoue K., Nakahara M., Bauer C. E. Science. 2000;289:1724–1730.
22. Green B. R., Gantt E. J. Phycology. 2000;36:983–985.
23. Baymann F., Brugna M., Muhlenhoff U., Nitschke W. Biochim. Biophys. Acta. 2001;1507:291–310.
24. Olson J. M. Photosynth. Res. 2001;68:95–112.
25. Garczarek L., Poupon A., Partensky F. FEMS Microbiol. Lett. 2003;222:59–68.
26. Green B. R. In: Light-Harvesting Antennas in Photosynthesis. Green B. R., Parson W. W., editors. Dordrecht, The Netherlands: Kluwer; 2003. pp. 129–168.
27. Gupta R. S. Photosynth. Res. 2003;76:173–183.
28. Rutherford A. W., Faller P. Philos. Trans. R. Soc. London B. 2003;358:245–253.
29. De Las Rivas J., Balsera M., Barber J. Trends Plant Sci. 2004;9:18–25.
30. Olson J. M., Blankenship R. E. Photosynth. Res. 2004;80:373–386.
31. Mix L. J., Haig D., Cavanaugh C. M. J. Mol. Evol. 2005;60:153–163.
32. Nelson N., Ben-Shem A. BioEssays. 2005;27:914–922.
33. Olson J. M. Photosynth. Res. 2006;88:109–117.
34. Zhang H., Cramer W. A. Methods Mol. Biol. 2004;274:67–78.
35. Dufresne A., Salanoubat M., Partensky F., Artiguenave F., Axmann I. M., Barbe V., Duprat S., Galperin M. Y., Koonin E. V., Le Gall F., et al. Proc. Natl. Acad. Sci. USA. 2003;100:10020–10025.
36. Nakamura Y., Kaneko T., Sato S., Ikeuchi M., Katoh H., Sasamoto S., Watanabe A., Iriguchi M., Kawashima K., Kimura T., et al. DNA Res. 2002;9:123–130.
37. Inoue H., Tsuchiya T., Satoh S., Miyashita H., Kaneko T., Tabata S., Tanaka A., Mimuro M. FEBS Lett. 2004;578:275–279.
38. Palenik B., Brahamsha B., Larimer F. W., Land M., Hauser L., Chain P., Lamerdin J., Regala W., Allen E. E., McCarren J., et al. Nature. 2003;424:1037–1042.
39. Galperin M. Y., Koonin E. V. Nucleic Acids Res. 2004;32:5452–5463.
40. Larkin R. M., Alonso J. M., Ecker J. R., Chory J. Science. 2003;299:902–906.
41. Dauvillee D., Stampacchia O., Girard-Bascou J., Rochaix J. D. EMBO J. 2003;22:6378–6388.
42. Prommeenate P., Lennon A. M., Markert C., Hippler M., Nixon P. J. J. Biol. Chem. 2004;279:28165–28173.
43. Battchikova N., Zhang P., Rudd S., Ogawa T., Aro E. M. J. Biol. Chem. 2005;280:2587–2595.
44. Gornicki P. Int. J. Parasitol. 2003;33:885–896.
45. Tripathi R. P., Mishra R. C., Dwivedi N., Tewari N., Verma S. S. Curr. Med. Chem. 2005;12:2643–2659.
46. Hansel A., Pattus F., Jurgens U. J., Tadros M. H. Biochim. Biophys. Acta. 1998;1399:31–39.
47. Huang F., Fulda S., Hagemann M., Norling B. Proteomics. 2006;6:910–920.
48. Tatusov R. L., Koonin E. V., Lipman D. J. Science. 1997;278:631–637.
49. Tatusov R. L., Galperin M. Y., Natale D. A., Koonin E. V. Nucleic Acids Res. 2000;28:33–36.
50. Hanson T. E., Tabita F. R. Proc. Natl. Acad. Sci. USA. 2001;98:4397–4402.
51. Ashida H., Danchin A., Yokota A. Res. Microbiol. 2005;156:611–618.
52. Herter S., Farfsing J., Gad’On N., Rieder C., Eisenreich W., Bacher A., Fuchs G. J. Bacteriol. 2001;183:4305–4316.
53. Nakamura Y., Kaneko T., Tabata S. Nucleic Acids Res. 2000;28:72.
54. Martin K. A., Siefert J. L., Yerrapragada S., Lu Y., McNeill T. Z., Moreno P. A., Weinstock G. M., Widger W. R., Fox G. E. Photosynth. Res. 2003;75:211–221.
55. Sonnhammer E. L., Koonin E. V. Trends Genet. 2002;18:619–620.
56. Overmann J., Garcia-Pichel F. In: The Prokaryotes: An Evolving Electronic Resource for the Microbiological Community, Release 3.2. Dworkin M., editor. New York: Springer; 2000. Available at http://link.springer-ny.com/link/service/books/10125. Accessed March 3, 2006.
57. Choudhary M., Kaplan S. Nucleic Acids Res. 2000;28:862–867.
58. Igarashi N., Harada J., Nagashima S., Matsuura K., Shimada K., Nagashima K. V. J. Mol. Evol. 2001;52:333–341.
59. Lindell D., Sullivan M. B., Johnson Z. I., Tolonen A. C., Rohwer F., Chisholm S. W. Proc. Natl. Acad. Sci. USA. 2004;101:11013–11018.
60. Liebl U., Mockensturm-Wilson M., Trost J. T., Brune D. C., Blankenship R. E., Vermaas W. Proc. Natl. Acad. Sci. USA. 1993;90:7124–7128.
61. Tice M. M., Lowe D. R. Nature. 2004;431:549–552.
62. Tice M. M., Lowe D. R. Geology. 2006;34:37–40.
63. Nisbet E. G., Sleep N. H. Nature. 2001;409:1083–1091.
64. Berg I. A., Keppen O. I., Krasil’nikova E. N., Ugol’kova N. V., Ivanovskii R. N. Mikrobiologiia. 2005;74:258–264.
65. Summons R. E., Jahnke L. L., Hope J. M., Logan G. A. Nature. 1999;400:554–557.
66. Golden J. W., Yoon H. S. Curr. Opin. Microbiol. 2003;6:557–563.
67. Tomitani A., Knoll A. H., Cavanaugh C. M., Ohno T. Proc. Natl. Acad. Sci. USA. 2006;103:5442–5447.
68. Garcia-Pichel F. Origins Life Evol. Biosphere. 1998;28:321–347.
69. Friedrich T., Scheide D. FEBS Lett. 2000;479:1–5.
70. Altschul S. F., Madden T. L., Schaffer A. A., Zhang J., Zheng Z., Miller W., Lipman D. J. Nucleic Acids Res. 1997;25:3389–3402.

Articles from Proceedings of the National Academy of Sciences of the United States of America are provided here courtesy of National Academy of Sciences