Genome Res. Sep 2006; 16(9): 1099–1108.
PMCID: PMC1557764

Phylogenetic analyses of cyanobacterial genomes: Quantification of horizontal gene transfer events

Abstract

Using 1128 protein-coding gene families from 11 completely sequenced cyanobacterial genomes, we attempt to quantify horizontal gene transfer events within cyanobacteria, as well as between cyanobacteria and other phyla. A novel method of detecting and enumerating potential horizontal gene transfer events within a group of organisms based on analyses of “embedded quartets” allows us to identify phylogenetic signal consistent with a plurality of gene families, as well as to delineate cases of conflict to the plurality signal, which include horizontally transferred genes. To infer horizontal gene transfer events between cyanobacteria and other phyla, we added homologs from 168 available genomes. We screened phylogenetic trees reconstructed for each of these extended gene families for highly supported monophyly of cyanobacteria (or lack of it). Cyanobacterial genomes reveal a complex evolutionary history, which cannot be represented by a single strictly bifurcating tree for all genes or even most genes, although a single completely resolved phylogeny was recovered from the quartets’ plurality signals. We find more conflicts within cyanobacteria than between cyanobacteria and other phyla. We also find that genes from all functional categories are subject to transfer. However, in interphylum as compared to intraphylum transfers, the proportion of metabolic (operational) gene transfers increases, while the proportion of informational gene transfers decreases.

Cyanobacteria occupy a diverse range of habitats. The 11 genome sequences included in this study represent freshwater, marine, and hot spring species, including four closely related marine cyanobacteria from the Prochlorococcus/marine Synechococcus group. Based on the shared traits of oxygenic photosynthesis, several single gene analyses (e.g., Giovannoni et al. 1988), and analyses of shared indels (Gupta et al. 2003), all cyanobacteria form a monophyletic phylogenetic group. Indeed, the “coherence” of cyanobacteria—by which we mean monophyly for all or the vast majority of genes—is often considered self-evident, and asserted without elaboration or citation (e.g., Hagen and Meeks 2001; Otero and Vincenzini 2004).

Rippka et al. (1979) divided cyanobacteria into five sections based on morphology. However, this classification does not correspond with molecular markers, including ribosomal RNAs. In 16S rRNA phylogenies (Turner 1997; Honda et al. 1999; Turner et al. 1999; Wilmotte and Herdman 2001), cyanobacteria form several statistically supported clusters different from the sections of Rippka et al. (1979), and with no clearly resolved relationships among the clusters. This poor resolution might be explained by rapid radiation or by recombination within 16S rRNA sufficient to scramble the phylogenetic signal (e.g., Yap et al. 1999; Boucher et al. 2004; Miller et al. 2005; Morandi et al. 2005).

Horizontal (or lateral) gene transfer (HGT), potentially followed by recombination with or replacement of resident homologs (orthologous replacement), is now recognized as a major force shaping evolutionary histories of prokaryotes (e.g., Koonin et al. 2001; Zhaxybayeva and Gogarten 2002; Boucher et al. 2003) and eukaryotes (e.g., Mitreva et al. 2005). Among methods for detecting instances of HGT are observations of unusual evolutionary patterns in gene phylogenies, patchy phylogenetic distribution, and atypical nucleotide composition (Ochman et al. 2000; Ragan 2001). These different methodologies produce varying estimates of HGT.

Atypical nucleotide composition methods indicate that individual cyanobacterial genomes have acquired between 9.5% and 16.6% of their genes through HGT (Ochman et al. 2000; Nakamura et al. 2004). Because acquired genes eventually “ameliorate” to have the compositional characteristics of their new environment, these are almost certainly serious underestimates. Individual instances of molecular markers in cyanobacteria with contradicting phylogenetic histories continue to accumulate as well (e.g., Rudi et al. 1998; Seo and Yokota 2003). In an earlier study, some of us (Zhaxybayeva et al. 2004) performed genome-wide bipartition analyses of 678 data sets of orthologous genes (or data sets, for short) present in 10 cyanobacterial genomes. The plurality consensus of these data sets was poorly resolved. Nevertheless, many individual gene families contradicted it, suggesting that gene families in cyanobacterial genomes have complex, frequently noncongruent, phylogenetic histories. Population studies of Microcoleus chthonoplastes and Nodularia sp. indicate very high rates of homologous recombination (Barker et al. 2000; Lodders et al. 2005). Cyanophages infecting marine cyanobacteria have been reported to contain genes important for photosynthesis (Mann et al. 2003; Lindell et al. 2004; Millard et al. 2004; Sullivan et al. 2005; Zeidner et al. 2005), and likely mediate transfer and recombination of these genes among marine cyanobacteria (Zeidner et al. 2005). Indeed, Zeidner et al. (2005) postulated that the diversity accumulated in phage psbA genes may serve as an evolutionary reservoir for hosts and increases the hosts’ chances of adapting to changing environments. Additional evidence for the occurrence of HGT in cyanobacteria comes from laboratory experiments with Synechocystis sp. PCC6803 and Thermosynechococcus elongatus BP1 in which mutants are made through the recombination with exogenous DNA (e.g., Ikeuchi and Tabata 2001; Iwai et al. 2004).

In spite of such evidence for HGT involving individual gene families in cyanobacteria and many other bacterial groups, coherence of the phyla is often assumed, and invoked as evidence that HGT is in the long run a weak force, and no serious challenge to the historical accuracy of the rRNA-based Tree of Life. There have been few systematic and exhaustive assessments of the extent to which bacterial phyla really are coherent (Beiko et al. 2005; Kunin et al. 2005). Furthermore, coherence, even if well documented, does not mean that HGT is unimportant or infrequent within phyla. There are many reasons to expect that within-phylum HGT will be more vigorous and more fruitful than between-phylum exchange. In some cases, members of a phylum are more likely to occupy similar environments, and encounter each other’s DNA. There will be phylum-specific constraints based on physiology: for instance, cyanobacteria could not profitably incorporate individual genes of the methanogenesis pathway, but there might be many circumstances in which variant photosynthetic genes could benefit near or distant relatives within the cyanobacteria. Finally, between-phylum differences in genome organization (Lawrence and Hendrickson 2005) and in the machinery of gene expression and its regulation may constrain effective HGT.

More frequent within-phylum orthologous replacement (and homologous recombination) would serve to maintain similarity between members, while allowing divergence between phyla. Thus, the preferential sharing of a common gene pool could be itself the principal cause of coherence (Gogarten et al. 2002; Olendzenski et al. 2002). To test this, comparative estimations of the extent of inter- and intragroup transfers are needed. In this study, we analyze sets of orthologous genes from 11 available cyanobacterial genomes and their homologs in 168 other prokaryotes and attempt to quantify the number of transfers that occurred within cyanobacteria and between cyanobacteria and other phyla.

Results

Selection of sets of orthologous genes

Detection of orthologous genes is an important step in attempts to estimate HGT events. Poor selection of sets of orthologous genes leads to hidden paralogy, which is a serious problem in phylogenetic reconstruction. Several different approaches are used frequently to detect sets of orthologous genes (e.g., Zhaxybayeva and Gogarten 2002; Lerat et al. 2003; Tatusov et al. 2003; Harlow et al. 2004); none is perfect. Our very conservative method (see Zhaxybayeva and Gogarten 2002 for methodology), requiring a reciprocal top-scoring BLAST hit for each member of a set of orthologs, may miss many sets of legitimate orthologous genes, but it minimizes data sets contaminated with paralogs.
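The reciprocity requirement can be illustrated with a minimal sketch. It assumes that the top-scoring BLAST hits have already been parsed into a lookup table; the function name `is_fully_reciprocal` and the layout of `best_hit` are hypothetical and are not taken from the original pipeline.

```python
from itertools import combinations

def is_fully_reciprocal(family, best_hit):
    """Return True if every pair of members are each other's top-scoring hit.

    family   : dict mapping genome name -> gene id chosen from that genome
    best_hit : dict mapping (query gene id, target genome) -> top-scoring
               gene id in the target genome (assumed pre-computed from BLAST)
    """
    for (genome_a, gene_a), (genome_b, gene_b) in combinations(family.items(), 2):
        # The top hit must hold in both directions for this pair.
        if best_hit.get((gene_a, genome_b)) != gene_b:
            return False
        if best_hit.get((gene_b, genome_a)) != gene_a:
            return False
    return True

# Toy example with three genomes (all identifiers are made up).
family = {"A": "a1", "B": "b1", "C": "c1"}
best_hit = {
    ("a1", "B"): "b1", ("b1", "A"): "a1",
    ("a1", "C"): "c1", ("c1", "A"): "a1",
    ("b1", "C"): "c1", ("c1", "B"): "b1",
}
print(is_fully_reciprocal(family, best_hit))
```

A single non-reciprocal pair is enough to reject the whole set, which is why a criterion of this kind is conservative: it discards some legitimate orthologs in exchange for fewer paralog-contaminated data sets.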

Many genome-wide analyses (including analyses of cyanobacterial genomes in Zhaxybayeva et al. 2004) are based on a core set of genes, that is, genes present in all analyzed genomes. This restricts such studies to a limited number of very conserved genes, and as more genomes are added to improve taxon sampling, the size of the core set of genes decreases (Charlebois and Doolittle 2004). Using our selection criterion, there are, for instance, only 663 genes present in all 11 of the cyanobacterial genomes, although there are 3804 found in at least four genomes, the minimum for any comparative phylogenetic analysis (see Table 1). Here we use a “relaxed core” of 1128 genes, those identified in at least nine of the 11 genomes, reasoning that such nearly ubiquitous genes probably determine many of the characters by which cyanobacteria are judged to be a “coherent” group.
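The difference between a strict core and the relaxed core is only a presence threshold. The following sketch assumes a hypothetical mapping from family name to the genomes in which that family was detected; the names `relaxed_core` and `families` are illustrative.

```python
def relaxed_core(families, min_genomes=9):
    """Return the names of families detected in at least `min_genomes` genomes.

    families : dict mapping family name -> iterable of genome names
    """
    return sorted(
        name for name, genomes in families.items()
        if len(set(genomes)) >= min_genomes
    )

# Toy input: genomes are numbered 0..10 purely for illustration.
families = {
    "fam_all": list(range(11)),   # present in all 11 genomes
    "fam_nine": list(range(9)),   # present in 9 genomes
    "fam_four": list(range(4)),   # present in 4 genomes
}
print(relaxed_core(families, min_genomes=11))  # strict core
print(relaxed_core(families, min_genomes=9))   # relaxed core
print(relaxed_core(families, min_genomes=4))   # minimum for a quartet
```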

Table 1.
Number of sets of orthologous genes detected in 11 cyanobacterial genomes

Embedded quartet decomposition analyses

For this analysis, we developed a new tool that allows the inclusion of sets of orthologous genes with missing data. The tool is designed to examine all possible “embedded quartets” for each set of orthologous genes detected in a group of analyzed genomes, that is, all possible four-taxon trees that are consistent with (embedded within) a corresponding gene tree (Fig. 1). This method—“embedded quartet decomposition analyses” or “quartet decomposition,” for short—is conceptually similar to the spectral analysis method of Hendy and Penny (1993) and Lento et al. (1995). A data set of orthologous genes can be, in principle, as small as four taxa (containing one quartet) or as large as 11 taxa (containing 330 embedded quartets). The resulting embedded quartets can be summarized to depict the evolutionary relationships among the genomes that are supported by a plurality of orthologous sets, as well as to delineate genes that conflict with this plurality consensus. We applied this method to the relaxed core of cyanobacteria.
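The quartet counts quoted above follow from simple combinatorics, which a short sketch can make concrete. The function names are hypothetical; the only facts used are that a quartet is any choice of four taxa and that four taxa admit three unrooted topologies.

```python
from itertools import combinations
from math import comb

def embedded_quartets(taxa):
    """All four-taxon subsets of the taxa present in one data set."""
    return list(combinations(sorted(taxa), 4))

def quartet_topologies(quartet):
    """The three possible unrooted topologies for four taxa, as pairs of pairs."""
    a, b, c, d = quartet
    return [((a, b), (c, d)), ((a, c), (b, d)), ((a, d), (b, c))]

taxa = [f"taxon{i}" for i in range(1, 12)]   # 11 placeholder taxon names
quartets = embedded_quartets(taxa)
print(len(quartets), comb(11, 4))            # both counts agree
print(len(embedded_quartets(taxa[:9])))      # a data set missing two genomes
print(quartet_topologies(quartets[0]))
```

Because each data set contributes only the quartets whose four taxa it contains, data sets with missing genomes can be summarized alongside complete ones.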

Figure 1.
Illustration of an embedded quartet in a gene tree. In each 11-taxon unrooted gene tree (shown in gray thin lines), we look at the relationship of any four taxa at a time (shown in thick black lines is an example of an embedded tree for taxa 1, 4, 9, ...

Screening quartets for minimization of false inferences

Quartets with a very short internal branch can produce misleading results because they carry too little phylogenetic information. We therefore removed from our relaxed core data sets all quartets with fewer than three amino acid substitutions along the internal branch; 27 data sets had at least one embedded quartet with such a very short internal branch. To reduce long branch attraction artifacts (Felsenstein 1978), we also excluded embedded quartets with any unbroken external branch more than 10 times longer than the internal branch; 798 data sets had at least one quartet excluded at this step.
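The two screening criteria can be sketched as a single predicate. This is an illustration under stated assumptions, not the authors' tool: it assumes branch lengths are given in substitutions per site, that the number of substitutions on the internal branch is approximated as branch length times alignment length, and it does not model the distinction between broken and unbroken external branches. The name `passes_screen` and its parameters are hypothetical.

```python
def passes_screen(internal_len, external_lens, n_sites,
                  min_subs=3.0, max_ratio=10.0):
    """Return True if a quartet survives both screening criteria.

    internal_len  : length of the quartet's internal branch (subs/site)
    external_lens : lengths of its four external branches (subs/site)
    n_sites       : alignment length, used to estimate substitution counts
                    (an assumption of this sketch)
    """
    # Criterion 1: at least `min_subs` substitutions on the internal branch.
    if internal_len * n_sites < min_subs:
        return False
    # Criterion 2: no external branch more than `max_ratio` times the
    # internal branch (guards against long branch attraction).
    return all(ext <= max_ratio * internal_len for ext in external_lens)

print(passes_screen(0.02, [0.05, 0.10, 0.08, 0.03], n_sites=300))
print(passes_screen(0.005, [0.05, 0.10, 0.08, 0.03], n_sites=300))
print(passes_screen(0.02, [0.05, 0.30, 0.08, 0.03], n_sites=300))
```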

Power of detection as assessed through simulations

Even in the absence of phylogenetic reconstruction artifacts, one expects some false positives at any bootstrap support cutoff (i.e., phylogenies that conflict with the plurality signal by chance), because of the finite amount of phylogenetic data used for phylogenetic reconstruction. False positives will be particularly frequent among quartets resolved only by a few gene families. We performed genome evolution simulations without any gene loss or gain, that is, genes in genomes following strictly vertical inheritance (see Methods for details), to estimate the frequency of such false positives and the likely efficacy of the screening methods for minimizing false inference described above.

We found that when >30% of the data sets resolve an embedded quartet (i.e., support one of the three possible tree topologies with at least 80% bootstrap support), the number of false positives is negligible (see Table 2 and Supplemental material). However, simulations introducing HGT events showed that the conservative approach of excluding quartets resolved by <30% of the data sets increases the number of false negatives (i.e., undetected transfer events). Simulations of either sort produced similar results whether analyzed with PhyML (Guindon and Gascuel 2003) or TREE-PUZZLE (Schmidt et al. 2002); we therefore used the latter, faster method in the subsequent study of real data. We conclude that, overall, our screening methods likely result in underestimates of HGT.
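The 80% and 30% cutoffs can be expressed as two small checks. The sketch below assumes each data set contributes a bootstrap support vector of three percentages for a given quartet, and that the 30% fraction is taken over the data sets containing that quartet; the text does not spell out the denominator, so this is an assumption. Function names are hypothetical.

```python
def is_resolved(support_vector, min_support=80.0):
    """A data set resolves a quartet if one topology reaches the cutoff."""
    return max(support_vector) >= min_support

def fraction_resolving(vectors, min_support=80.0):
    """Fraction of data sets (those containing the quartet) that resolve it."""
    if not vectors:
        return 0.0
    return sum(is_resolved(v, min_support) for v in vectors) / len(vectors)

def keep_quartet(vectors, min_fraction=0.30, min_support=80.0):
    """Retain a quartet only if enough data sets resolve it."""
    return fraction_resolving(vectors, min_support) >= min_fraction

# Four hypothetical data sets containing the same quartet.
vectors = [[85, 10, 5], [40, 35, 25], [5, 90, 5], [50, 30, 20]]
print(fraction_resolving(vectors), keep_quartet(vectors))
```

Raising `min_fraction` trades false positives for false negatives, which is the tension the simulations are designed to quantify.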

Table 2.
Simulation results: Number of data sets that conflict with the plurality signal at different cutoff levels, at 80% bootstrap support

Plurality signal and estimation of conflicts within the cyanobacterial group based on analyses of the relaxed core

All embedded quartets retained after removal of those with short internal or long external branches were resolved by at least 30% of data sets. We summarized the plurality support for all embedded quartets across all data sets as well as conflicts with plurality in a diagram that we call a quartet spectrum (because we assess all possible combinations of four taxa, providing a full spectrum of possible relationships) (see Fig. 2). All quartet topologies supported by a plurality of data sets are compatible with each other, and therefore only one most parsimonious tree exists (a so-called perfect phylogeny, Felsenstein 2004). This tree, found using a supertree reconstruction algorithm (see Methods), is shown in Figure 3.
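For a single embedded quartet, the plurality topology and the data sets that conflict with it can be tallied as follows. This is a minimal sketch with hypothetical names; it assumes each data set has already been reduced to the index (0, 1, or 2) of the topology it supports at the bootstrap cutoff, and that at least one data set resolves the quartet.

```python
from collections import Counter

def plurality_and_conflicts(resolved):
    """Return (plurality topology index, names of conflicting data sets).

    resolved : non-empty dict mapping data set name -> topology index (0, 1, 2)
               supported with high bootstrap support
    Ties are broken by first occurrence, a simplification of this sketch.
    """
    counts = Counter(resolved.values())
    plurality, _ = counts.most_common(1)[0]
    conflicts = sorted(name for name, topo in resolved.items()
                       if topo != plurality)
    return plurality, conflicts

resolved = {"g1": 0, "g2": 0, "g3": 1, "g4": 0, "g5": 2}
print(plurality_and_conflicts(resolved))
```

Repeating this tally over every retained quartet yields the two quantities a quartet spectrum displays: plurality support and the conflicts against it.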

Figure 2.
Quartet decomposition analysis of cyanobacteria. Panel A illustrates a component of quartet decomposition analysis. Each embedded quartet is represented by a vertical bar and a black dot. The black dot indicates how many data sets contain this embedded ...
Figure 3.
Visualization of the evolutionary history of cyanobacteria as inferred from quartet decomposition analyses. The unrooted tree topology was calculated from the embedded quartets supported by the plurality of sets of orthologous genes, and it is shown in ...

While the plurality signal supports one fully resolved tree topology, we found that a substantial proportion of data sets (685 data sets, or roughly 61% of analyzed data sets) exhibits conflict with the plurality signal in at least one embedded quartet. Some of these conflicts (those involving alternative sister relationships between terminal taxa and having at least 80% bootstrap support) are visualized in Figure 3. In the Supplemental material, we provide trees for the 131 data sets involved in conflicts indicated in Figure 3. One example is provided in Figure 4. Among genes conflicting with the plurality signal are genes involved in photosynthesis (see Table 3), including genes recently found in phages infecting Prochlorococcus (Lindell et al. 2004) and marine Synechococcus (Millard et al. 2004).

Figure 4.
Example of intraphylum transfer: a hemolysin-like protein. This example of horizontal gene transfer was extracted from the list of data sets exhibiting conflicts with the plurality signal. This gene family has detectable homologs in other phyla, but phylogenetic ...
Table 3.
Photosynthesis genes that are observed to conflict with the plurality signal

Incongruence of gene histories among Prochlorococcus/Synechococcus

Among the 11 genomes, four belong to the Prochlorococcus/marine Synechococcus group. Members of the Prochlorococcus genus were discovered only recently, owing to their anomalously low fluorescence and small size (Chisholm et al. 1988). Marine Synechococcus and Prochlorococcus are proposed to have diverged from a common phycobilisome-containing ancestor (Ting et al. 2002). While marine Synechococcus still uses phycobilisomes as light-harvesting antennae, members of the Prochlorococcus genus lack phycobilisomes and use a different antenna complex (Pcb), and they possess derivatives of chlorophyll a and b that are unique to this genus (for a recent review, see Partensky et al. 1999). In addition, marine Synechococcus and Prochlorococcus are adapted to different ecological niches: Marine Synechococcus is prevalent in coastal waters, while Prochlorococcus is ubiquitous in open subtropical and tropical ocean. Within Prochlorococcus marinus, two “ecotypes” are differentiated: low-light-adapted and high-light-adapted types (Rocap et al. 2003). In the 16S rRNA tree, high-light-adapted Prochlorococcus spp. arise from within a low-light-adapted clade (Ting et al. 2002).

We find numerous conflicts between these four genomes, those involving highly supported apparent transfers between terminal taxa being shown in Figure 3. Similarly, in a recent study, Beiko et al. (2005) report >250 HGT events among these marine cyanobacteria. Although these genomes are reported to have accelerated rates of evolution (Dufresne et al. 2005), and hence could be more prone to the long branch attraction artifact, the quartets with long branches were excluded from our analyses (see above). Interestingly, the relationship among these four genomes captured by the plurality of gene families supports neither the relationship inferred from phylogenetic analyses of 16S rRNA (e.g., Ting et al. 2002; Dufresne et al. 2005), nor the grouping based on proposed ecotypes (Ting et al. 2002; Rocap et al. 2003). This can be explained by rampant gene flow among these genomes, with the plurality consensus no longer reflecting ecotype physiology. Notably, the majority of observed conflicts with the plurality occur between the two low-light-adapted ecotypes (see Fig. 3).

Transfers between cyanobacteria and other phyla

Quartet decomposition analysis only detects conflicts within the group of analyzed genomes. However, observed conflicts could be instances of incongruence produced by transfers from outside of cyanobacteria into only one or a few cyanobacterial lineages. To correct for this and to estimate the number of transfers that occurred between the cyanobacteria and organisms from other phyla, we added homologous genes from other completely sequenced genomes spanning Bacterial and Archaeal domains. Phylogenetic analysis of such “extended data sets” identifies putative instances of horizontal gene transfer from/to the cyanobacteria.

Out of 1128 data sets, 879 had detectable homologs in selected prokaryotic genomes, and the remaining 249 data sets were “cyanobacteria-specific” (and therefore not suitable for estimation of interphylum transfers). Seven hundred of these 879 data sets were “phylogenetically useful,” that is, were sufficiently resolved and either had cyanobacteria as a coherent group with 80% bootstrap support or had other taxa grouping within cyanobacteria at 80% bootstrap support. Of these 700 data sets, 540 support cyanobacteria as a coherent group (~77%), while 160 data sets (~23%) either have sequences from other taxa interspersed among cyanobacteria or some cyanobacterial sequences grouping somewhere else, suggesting possible transfer events to or from cyanobacteria (an example of such a data set is shown in Fig. 5, and additional examples are available as Supplemental material). Interestingly, 294 out of 540 data sets that support cyanobacteria as a monophyletic group (54%, or 42% of the 700 phylogenetically useful data sets) conflict with the plurality consensus based on the quartet decomposition analyses (see above). This estimation suggests that there are more conflicts observed within cyanobacteria than between cyanobacteria and other phyla.

Figure 5.
Example of horizontal gene transfer to cyanobacteria: threonyl tRNA synthetase. This is a phylogenetic tree reconstructed from a data set in which the Anabaena sp. genome did not have a detectable homolog in its annotation. In this tree, sequences of ...
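The proportions reported in this subsection can be re-derived directly from the stated counts. The snippet below is only an arithmetic check; the variable names are illustrative.

```python
# Counts taken from the text of this subsection.
all_sets = 1128
with_homologs = 879
cyano_specific = 249
useful = 700
monophyletic = 540
non_monophyletic = 160
monophyletic_with_conflict = 294

def percent(part, whole):
    """Percentage rounded to the nearest integer."""
    return round(100 * part / whole)

print(with_homologs + cyano_specific == all_sets)
print(monophyletic + non_monophyletic == useful)
print(percent(monophyletic, useful))                    # share supporting monophyly
print(percent(non_monophyletic, useful))                # share not supporting it
print(percent(monophyletic_with_conflict, monophyletic))
print(percent(monophyletic_with_conflict, useful))
```

Comparing 294 data sets with within-phylum conflict against 160 with between-phylum conflict is the basis for the statement that conflicts are more common within cyanobacteria.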

Distribution of genes among functional categories

We looked at the distribution of all analyzed cyanobacterial gene families, as well as the distribution of gene families present in extended data sets, across functional categories as defined in the COG database (see Figs. 6 and 7). Functional category analysis of cyanobacterial data sets conflicting with the plurality signal shows that genes from all functional categories are among the conflicting genes (see Fig. 6), including genes from information storage and processing categories (categories J and K, according to the COG database abbreviations) (Tatusov et al. 2003). However, when we analyzed the distribution of extended data sets across functional categories (see Fig. 7), we found that in interphylum transfers, metabolic genes are overrepresented, and “information storage and processing” genes are underrepresented.

Figure 6.
Distribution of cyanobacterial sets of orthologous genes across functional categories. The functional categories are according to the COG database, March 2003 release (Tatusov et al. 2003). Panel A shows the distribution of all 1128 analyzed genes, while ...
Figure 7.
Distribution of phylogenetically useful extended genes across functional categories. Panel A shows the distribution of 700 phylogenetically useful extended data sets. Panel B shows the distribution of 160 sets where cyanobacteria do not form a monophyletic ...

Discussion

Several recent studies attempt to estimate the number of HGT events at different taxonomic levels (e.g., Snel et al. 2002; Lerat et al. 2003; Mirkin et al. 2003; Beiko et al. 2005; Ge et al. 2005; Kunin et al. 2005). In some, investigators were only interested in the transfer of novel genes into genomes (Snel et al. 2002; Mirkin et al. 2003; Kunin et al. 2005), thus neglecting orthologous replacement, and underestimating the total number of transfer events among genomes of interest. In others, addressing orthologous replacement, investigators have limited themselves to analyses of very strictly defined (ubiquitous) core genes (Lerat et al. 2003; Ge et al. 2005), also underestimating the total number of horizontally transferred genes. In studies of either sort, consensus trees derived from bootstrap or posterior probability analyses, or trees based on concatenated data sets have been used as references against which to assess HGT. The former are often only partially resolved (e.g., as in Snel et al. 2002) and thus preclude detection of many phylogenetic conflicts, while the latter, as commonly used, entail the assumption that most genes do have a single history. For instance, Lerat et al. (2003) assume that individual data sets that do not statistically reject a tree based on their concatenated sequences have not experienced HGT when, in fact, their weak phylogenetic signals are compatible with many conflicting topologies (Bapteste et al. 2004).

To avoid such problems, we make an a priori assumption that individual sets of orthologous genes need not share the same evolutionary history, and therefore are not suitable for concatenation. Quartet decomposition analyses also avoid the “averaging” effect of consensus trees, since they partition trees inferred for each bootstrapped sample into sets of possible embedded quartets, and allow summarizing data sets with varying numbers of taxa in a single diagram (Fig. 2). In addition, the quartet decomposition method represents an improvement over methods that rely on analyses of bipartitions, since the support values for individual embedded quartets do not decay when the internal branches become shorter because of more sequences being included in the analysis.

While a majority of analyzed extended data sets (~77%) support coherence (monophyly) of cyanobacteria, some do not, and we find significant conflicting phylogenetic signals within cyanobacteria (~61% of analyzed data sets). Such conflicts could be caused by (1) instances of horizontally transferred genes; (2) differentially lost paralogs, which are impossible to discriminate from transfer events; (3) systematic artifacts of phylogenetic reconstruction (e.g., long branch attraction, compositional biases, or biases introduced through wrong models of phylogenetic reconstruction); and (4) false positives caused by insufficient phylogenetic signal. All analyses based on phylogenetic inference face these problems (Gogarten and Townsend 2005), and we took necessary precautions to minimize their impact on our genome-wide analyses (see Results for more details). Indeed, our simulation studies suggest that our screening approach is conservative, and likely to result in underestimating the extent of HGT. No doubt, among the observed conflicts are instances of real transfer events. For example, we observe conflicting signals in genes that are found in phages (see Table 3 and Lindell et al. 2004; Millard et al. 2004; Sullivan et al. 2005; Zeidner et al. 2005), and therefore are very probable candidates for HGT. (At the same time, our simulation study shows how frequently false positives arise, and casts a shadow on the reliability of phylogenetic reconstruction in general.)

False negatives are also inevitable. Transfers between sister taxa are undetectable, as will be many from unsequenced donors with no sequenced close relatives. Simulations confirm that many transfers escape detection (Table 2), probably because of the causes mentioned above. Thus, the number of detected transfers in cyanobacteria that we report here should, indeed, be considered an underestimate.

Although a majority of our data sets conflict with the plurality signal, all plurality quartets are compatible with a single fully resolved phylogenetic tree (see Fig. 3). Does the plurality topology reflect an “organismal phylogeny”? Gary Olsen has suggested a rope metaphor to illustrate the evolution of organisms and their genes (cited in Zhaxybayeva et al. 2004). The rope (representing an organismal lineage) has continuity despite the fact that no individual rope fiber (representing genes) persists throughout the entire rope. Using this metaphor, we might define the organismal lineage as that determined by the plurality of genes passed on over short time intervals. While this metaphor yields a theoretical definition of organismal lineage, it is not clear that the organismal lineage always can be reconstructed from the character of the individual fibers (genes).

Given this definition of organismal lineage, our plurality topology can be interpreted as a snapshot of relationships among extant cyanobacterial lineages. However, the picture is incomplete without providing information about observed conflicts to the plurality signal. Figure 3 depicts 135 conflicts observed between the tips of the plurality topology. This accounts only for a subset of all 685 observed conflicts, since the majority of transfers, deeper in the tree, affect the positions of multiple taxa.

The relationships recovered for the Prochlorococcus/Synechococcus group point toward extensive gene flow between well-characterized groups of organisms.

The three Prochlorococcus marinus strains share many derived characteristics including cell shape, environment, type of antenna pigments (Partensky et al. 1999), and a distinctive threonyl tRNA synthetase likely acquired by HGT (Fig. 5). These synapomorphies notwithstanding, the plurality signal recovered from our analysis and the analysis by Beiko et al. (2005) group one of the P. marinus strains as sister taxon to a marine Synechococcus. These conflicting phylogenetic signals reveal a fuzzy species boundary that was postulated for prokaryotes (Lawrence 2002): under this model, only the neighborhood of genes conferring ecological distinctiveness is expected to conform to the biological species concept, whereas other genes recombine freely across the species boundary. However, fuzzy species boundaries are also found in eukaryotes without post-mating barriers. For example, in incipient species of Darwin's finches, frequent introgression can make some individuals characterized as belonging to the same species by morphology and mating behavior genetically more similar to a sister species (Grant et al. 2004).

Distribution of genes across functional categories shows that genes from all functional categories are transferred (see Figs. 6 and 7). We do not see a bias toward any biological function among the intraphylum HGT events (see Fig. 6), contradicting a recent report by Nakamura et al. (2004) and the complexity hypothesis (Jain et al. 1999).

The fact that we detect that ~50% of extended gene families putatively have a history of HGT (either between cyanobacteria and other phyla, or within cyanobacteria, or both) suggests that HGT plays an important role in the evolution of cyanobacteria, and the relationships among the taxa of this phylogenetic group cannot be represented by a strictly bifurcating tree. We find more conflicts within cyanobacteria than between cyanobacteria and other phyla, as did Beiko et al. (2005), using different methods of gene family selection, phylogenetic reconstruction, and HGT identification. Thus, our results are compatible with the hypothesis that HGT can reinforce coherence of a phylogenetic group.

Nevertheless, cyanobacteria are far from a fully coherent group (all genes supporting monophyly). Twenty-three percent of the 700 data sets for which monophyly was tested failed the test, showing non-cyanobacteria within the cyanobacterial clade, or cyanobacteria embedded within other phyla. Interestingly, even among those genes supporting cyanobacterial monophyly, a majority showed evidence of HGT within the cyanobacteria.

Methods

Quartet decomposition analyses

We analyzed 11 cyanobacterial genomes from NCBI and JGI databases: Anabaena sp. PCC7120, Trichodesmium erythraeum IMS101, Synechocystis sp. PCC6803, Prochlorococcus marinus CCMP1375 (also known as SS120), Prochlorococcus marinus MED4 (also known as CCMP1986), Prochlorococcus marinus MIT9313, marine Synechococcus WH8102, Thermosynechococcus elongatus BP-1, Gloeobacter violaceus PCC7421, Nostoc punctiforme ATCC29133, and Crocosphaera watsonii WH8501. We detected sets of orthologous protein-coding genes defined as mutual fully transitive reciprocal BLASTP (Altschul et al. 1997) hits (with E-value below 10^-4) (see Zhaxybayeva and Gogarten 2002 for methodology). There are 3804 genes present in at least four of the genomes (see Table 1). In the analyses of the “relaxed core” presented, we used the 1128 genes present in at least nine of the genomes. Each data set was aligned using the CLUSTALW program version 1.83 (Thompson et al. 1994). For each data set, the shape parameter for a Γ distribution to approximate among-site rate variation (Yang 1994) was estimated using TREE-PUZZLE version 5.2 (Schmidt et al. 2002) with four discrete categories. One hundred bootstrap samples were generated using the SEQBOOT program from PHYLIP package version 3.6 (Felsenstein 1993), and for each bootstrap sample, a distance matrix was calculated using TREE-PUZZLE version 5.2 with the shape parameter set as estimated for the original data set. Phylogenetic trees from the distance matrices were calculated with the NEIGHBOR program from the PHYLIP package version 3.6 (Felsenstein 1993). For each data set of orthologous genes, we generated a list of embedded quartets (i.e., all possible combinations of four taxa contained in the data set). For each embedded quartet in each data set the bootstrap support vector was calculated, that is, the bootstrap support for each of the three alternative quartet topologies (see Zhaxybayeva and Gogarten 2003 for methodology and discussion of advantages of embedded quartet analyses). False inferences were screened as described in Results, and quartets with at least 80% bootstrap support for one of three possible unrooted tree topologies were summarized in a quartet spectrum diagram (see Fig. 2).
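The bootstrap support vector of an embedded quartet can be sketched as follows. This is an illustration, not the authors' implementation: it assumes each bootstrap tree has already been reduced to its set of bipartitions (splits), with each split stored as the set of taxa on one side. A tree displays the topology ab|cd when some split separates {a, b} from {c, d}. All names are hypothetical.

```python
def quartet_topology(splits, quartet):
    """Index (0, 1, 2) of the quartet topology displayed by a tree, or None.

    splits  : iterable of frozensets, each the taxa on one side of a branch
    quartet : tuple (a, b, c, d); index 0 = ab|cd, 1 = ac|bd, 2 = ad|bc
    """
    a, b, c, d = quartet
    pairings = [({a, b}, {c, d}), ({a, c}, {b, d}), ({a, d}, {b, c})]
    for idx, (left, right) in enumerate(pairings):
        for side in splits:
            # One pair lies entirely on this side, the other entirely off it.
            if (left <= side and not right & side) or \
               (right <= side and not left & side):
                return idx
    return None  # unresolved in this tree

def support_vector(bootstrap_trees, quartet):
    """Percentage of bootstrap trees displaying each of the three topologies."""
    counts = [0, 0, 0]
    for splits in bootstrap_trees:
        idx = quartet_topology(splits, quartet)
        if idx is not None:
            counts[idx] += 1
    return [100.0 * c / len(bootstrap_trees) for c in counts]

# Two toy five-taxon trees: ((A,B),C,(D,E)) and ((A,C),B,(D,E)).
tree1 = [frozenset("AB"), frozenset("DE")]
tree2 = [frozenset("AC"), frozenset("DE")]
print(support_vector([tree1, tree1, tree1, tree2], ("A", "B", "C", "D")))
```

With 100 bootstrap trees per data set, as described above, each entry of the vector is simply the number of trees displaying that topology.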

Plurality signal reconstruction

Quartet topologies that were supported by a plurality of data sets were used to reconstruct a supertree using the “matrix representation using parsimony” (MRP) method (Baum 1992; Ragan 1992) as implemented in Clann version 2.0.2 (Creevey and McInerney 2005). The resulting matrix was analyzed in PAUP* version 4.0beta10 (Swofford 1998) using an exhaustive tree space search to find the most parsimonious tree.
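
The MRP encoding can be sketched as follows. This illustrates the general Baum/Ragan coding applied to quartet splits and is not a description of Clann's internal implementation. The function name and input format are assumptions. Each plurality-supported quartet ab|cd becomes one matrix column in which the two taxa on one side are coded 1, the two on the other side 0, and all remaining taxa are coded as missing (?).

```python
def mrp_matrix(taxa, supported_quartets):
    """Encode quartet topologies as columns of a binary matrix with missing data.

    Each topology is given as ((a, b), (c, d)), meaning the split ab|cd.
    Returns a dict mapping each taxon to its row of character states.
    """
    rows = {taxon: [] for taxon in taxa}
    for first_pair, second_pair in supported_quartets:
        for taxon in taxa:
            if taxon in first_pair:
                rows[taxon].append("1")
            elif taxon in second_pair:
                rows[taxon].append("0")
            else:
                rows[taxon].append("?")  # taxon is absent from this quartet
    return {taxon: "".join(states) for taxon, states in rows.items()}


# Toy example with five taxa and two supported quartets.
matrix = mrp_matrix(
    ["A", "B", "C", "D", "E"],
    [(("A", "B"), ("C", "D")), (("A", "C"), ("D", "E"))],
)
```

A matrix of this form is the kind of input that a parsimony search would then analyze to recover the supertree.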

Functional category assignments

Functional categories were assigned to sets of orthologous genes by performing BLASTP searches of the COG database, March 2003 release (Tatusov et al. 2003), choosing the category of the top-scoring BLAST hit.
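
A minimal sketch of the top-hit assignment is given below. It assumes the BLASTP results have already been parsed into (query, COG protein, bit score) tuples, and that the best hit is taken across all members of an orthologous set. Both choices are assumptions made for illustration, and all identifiers are hypothetical.

```python
def assign_category(family_members, hits, protein_to_category):
    """Return the COG category of the top-scoring hit for one orthologous set.

    hits: iterable of (query_id, cog_protein_id, bit_score) tuples.
    protein_to_category: dict mapping COG protein ids to category letters.
    Returns None when no member of the set has a hit with a known category.
    """
    best_score = None
    best_subject = None
    for query, subject, score in hits:
        if query not in family_members or subject not in protein_to_category:
            continue
        if best_score is None or score > best_score:
            best_score, best_subject = score, subject
    if best_subject is None:
        return None
    return protein_to_category[best_subject]


# Toy example: two family members, three hits, two known COG proteins.
category = assign_category(
    {"gene1", "gene2"},
    [("gene1", "cogP1", 210.0), ("gene2", "cogP2", 340.5), ("other", "cogP1", 900.0)],
    {"cogP1": "J", "cogP2": "C"},
)
```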

Simulations

We performed simulations of genome evolution using EvolSimulator (http://bioinformatics.org.au/evolsim/). Seven hundred genes were simulated for 10,000 generations in a dynamic population of genomes, with speciation balancing extinction (at a nominal rate of 0.015 events per generation) to maintain ~50 extant lineages at any given time following a brief initial phase of population growth. Disabling paralogous duplication as well as gene loss, each genome maintained exactly 700 genes whose orthology could thus be perfectly tracked. Parameters affecting sequence evolution were selected to reproduce a level of sequence divergence similar to that observed in the breadth of the cyanobacterial phylum. These parameters control mutation rates and biases at the gene and genome level, as well as per-residue substitution acceptabilities of the resulting proteins in a model that will be described elsewhere (R.G. Beiko and R.L. Charlebois, unpubl.).

In simulations to estimate false positives, no HGT was allowed. We selected 11 out of the 50 simulated genomes and performed the quartet decomposition analysis as described above. In addition, the same analysis was performed but using phylogenetic trees calculated instead with the PhyML program, version 2.4.4 (Guindon and Gascuel 2003).

To estimate false negatives (i.e., instances of HGT that are not detected), the second and third sets of simulations were conducted using the same parameters, but permitting relations-biased HGT (more transfer among more recently diverged genomes), at one of two rates (nominally 1.0 event per generation, and nominally 0.5 events, respectively, within the entire population). Each of the resulting 50 genomes had ~22%–25% of genes with a history of HGT in the one simulation, and 11%–13% of genes with a history of HGT in the other simulation. We selected the same 11 genomes and performed the quartet decomposition analysis as described above. Since not every family that had a history of HGT during a simulation would have an impact on the phylogeny of the subset of 11 genomes used in the analysis, we corrected the number of genes with a history of transfer only to include those whose history could cause phylogenetic incongruities in the 11-taxon subtree.

Extended data sets analyses

We added homologous sequences from 168 sequenced genomes (the list of genomes is available as Supplemental material) to each cyanobacterial data set by performing BLASTP searches and keeping the top-scoring hits for each cyanobacterial sequence in a data set with E-values below 10⁻²⁰. Sequences within an extended data set (other than cyanobacterial) with 99% or higher identity at the amino acid level were excluded from further analyses (to reduce the size of an extended data set). The extended data sets were aligned using CLUSTALW version 1.83 (Thompson et al. 1994). Sites in which at least 50% of the taxa had a gap were removed. The shape parameter α for the Γ distribution was estimated using TREE-PUZZLE version 5.2 (Schmidt et al. 2002). One hundred bootstrap samples were generated for each extended data set using SEQBOOT from the PHYLIP package. For each bootstrap sample, a distance matrix was calculated in TREE-PUZZLE (Schmidt et al. 2002) using discrete approximation of the Γ distribution with four categories and with a pre-calculated value of the shape parameter α. Using these maximum likelihood distances, phylogenetic trees were calculated with the NEIGHBOR program of the PHYLIP package. Data sets were divided into three categories based on the topologies of corresponding phylogenetic trees: (1) cyanobacteria form a monophyletic group with at least 80% bootstrap support; (2) other taxa intersperse with cyanobacteria with at least 80% bootstrap support; (3) insufficiently resolved to support either of the above. The latter were considered phylogenetically uninformative and were excluded from further analyses.
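
Two steps of this procedure, the removal of gap-rich sites and the three-way classification of data sets, can be sketched in Python. The sketch is illustrative and rests on stated assumptions: alignments are dictionaries of equal-length strings with "-" as the gap symbol, and bootstrap results are a mapping from one side of each bipartition to its support. Category 2 is operationalized here as a well-supported bipartition that is incompatible with cyanobacterial monophyly. This is one possible reading of the criterion, not necessarily the authors' exact test.

```python
def remove_gappy_sites(alignment, gap_fraction=0.5):
    """Drop alignment columns in which at least gap_fraction of taxa have a gap."""
    names = list(alignment)
    length = len(alignment[names[0]])
    keep = [
        i for i in range(length)
        if sum(alignment[name][i] == "-" for name in names) / len(names) < gap_fraction
    ]
    return {name: "".join(alignment[name][i] for i in keep) for name in names}


def classify_dataset(split_support, taxa, cyanobacteria, threshold=80):
    """Classify one extended data set from its bootstrap-supported bipartitions.

    split_support: dict mapping one side of a bipartition (frozenset) to its
    bootstrap support in percent.
    """
    taxa = frozenset(taxa)
    cyano = frozenset(cyanobacteria)
    others = taxa - cyano
    for side, support in split_support.items():
        if support < threshold:
            continue
        other_side = taxa - side
        if side == cyano or other_side == cyano:
            return "monophyletic"
        # Incompatible with monophyly: each side mixes both groups.
        if all(part & group for part in (side, other_side) for group in (cyano, others)):
            return "interspersed"
    return "unresolved"


# Toy alignment: columns 3 and 5 have gaps in at least half of the sequences.
trimmed = remove_gappy_sites(
    {"s1": "AC-GT", "s2": "A--GT", "s3": "AC-G-", "s4": "-CTG-"}
)
all_taxa = {"c1", "c2", "c3", "x1", "x2", "x3"}
cyano_taxa = {"c1", "c2", "c3"}
```

Because two incompatible bipartitions cannot both exceed 50% bootstrap support, the first two categories are mutually exclusive at the 80% threshold.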

Other software used

Most of the scripts for data analyses were written in Perl and Java (the scripts are available as Supplemental material). Java programs utilized the PAL library (Drummond and Strimmer 2001). The sets of orthologous genes were detected with the help of a MySQL database (http://www.mysql.com). Quartet spectra were plotted using GnuPlot (http://www.gnuplot.info). Combinations of M out of N taxa were generated using Chase's combinatorial algorithm (Chase 1970).

Acknowledgments

This work was supported through NASA Exobiology Program (NAG5-11470), NASA AISR (NNG04GP90G), and NSF Microbial Genetics Program (MCB-0237197) grants to J.P.G. and through CIHR (MOP-4467) and Genome Atlantic grants to W.F.D. O.Z. is supported through a CIHR Postdoctoral Fellowship and is an honorary Killam Postdoctoral Fellow at Dalhousie University.

Footnotes

Supplemental material is available online at http://www.genome.org and http://carrot.mcb.uconn.edu/cyano/.

Article published online ahead of print. Article and publication date are at http://www.genome.org/cgi/doi/10.1101/gr.5322306.

References

  • Altschul S.F., Madden T.L., Schaffer A.A., Zhang J., Zhang Z., Miller W., Lipman D.J. Gapped BLAST and PSI-BLAST: A new generation of protein database search programs. Nucleic Acids Res. 1997;25:3389–3402.
  • Andersson J.O., Sarchfield S.W., Roger A.J., Sjogren A.M., Davis L.A., Embley T.M. Gene transfers from Nanoarchaeota to an ancestor of diplomonads and parabasalids—Phylogenetic analyses of diplomonad genes reveal frequent lateral gene transfers affecting eukaryotes. Mol. Biol. Evol. 2005;22:85–90.
  • Bapteste E., Boucher Y., Leigh J., Doolittle W.F. Phylogenetic reconstruction and lateral gene transfer. Trends Microbiol. 2004;12:406–411.
  • Barker G.L., Handley B.A., Vacharapiyasophon P., Stevens J.R., Hayes P.K. Allele-specific PCR shows that genetic exchange occurs among genetically diverse Nodularia (cyanobacteria) filaments in the Baltic Sea. Microbiol. 2000;146:2865–2875.
  • Baum B. Combining trees as a way of combining data sets for phylogenetic inference, and the desirability of combining gene trees. Taxon. 1992;41:3–10.
  • Beiko R.G., Harlow T.J., Ragan M.A. Highways of gene sharing in prokaryotes. Proc. Natl. Acad. Sci. 2005;102:14332–14337.
  • Boucher Y., Douady C.J., Papke R.T., Walsh D.A., Boudreau M.E., Nesbo C.L., Case R.J., Doolittle W.F. Lateral gene transfer and the origins of prokaryotic groups. Annu. Rev. Genet. 2003;37:283–328.
  • Boucher Y., Douady C.J., Sharma A.K., Kamekura M., Doolittle W.F. Intragenomic heterogeneity and intergenomic recombination among haloarchaeal rRNA genes. J. Bacteriol. 2004;186:3980–3990.
  • Charlebois R.L., Doolittle W.F. Computing prokaryotic gene ubiquity: Rescuing the core from extinction. Genome Res. 2004;14:2469–2477.
  • Chase P. Algorithm 382: Combinations of M out of N objects [G6] Commun. ACM. 1970;13
  • Chisholm S.W., Olson R.J., Zettler E.R., Goericke R., Waterbury J.B., Welschmeyer N.A. A novel free-living prochlorophyte abundant in the oceanic euphotic zone. Nature. 1988;334:340–343.
  • Creevey C.J., McInerney J.O. Clann: Investigating phylogenetic information through supertree analyses. Bioinformatics. 2005;21:390–392.
  • Drummond A., Strimmer K. PAL: An object-oriented programming library for molecular evolution and phylogenetics. Bioinformatics. 2001;17:662–663.
  • Dufresne A., Garczarek L., Partensky F. Accelerated evolution associated with genome reduction in a free-living prokaryote. Genome Biol. 2005;6
  • Felsenstein J. Cases in which parsimony and compatibility methods will be positively misleading. Syst. Zool. 1978;27:401–410.
  • Felsenstein J. PHYLIP (Phylogeny Inference Package). Distributed by the author. Department of Genetics, University of Washington; Seattle: 1993.
  • Felsenstein J. Inferring phylogenies. Sinauer; Sunderland, MA: 2004.
  • Ge F., Wang L.-S., Kim J. The cobweb of life revealed by genome-scale estimates of horizontal gene transfer. PLoS Biol. 2005;3
  • Giovannoni S.J., Turner S., Olsen G.J., Barns S., Lane D.J., Pace N.R. Evolutionary relationships among cyanobacteria and green chloroplasts. J. Bacteriol. 1988;170:3584–3592.
  • Gogarten J.P., Townsend J.P. Horizontal gene transfer, genome innovation and evolution. Nat. Rev. Microbiol. 2005;3:679–687.
  • Gogarten J.P., Doolittle W.F., Lawrence J.G. Prokaryotic evolution in light of gene transfer. Mol. Biol. Evol. 2002;19:2226–2238.
  • Grant P.R., Grant B.R., Markert J.A., Keller L.F., Petren K. Convergent evolution of Darwin's finches caused by introgressive hybridization and selection. Evolution Int. J. Org. Evolution. 2004;58:1588–1599.
  • Guindon S., Gascuel O. A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. Syst. Biol. 2003;52:696–704.
  • Gupta R.S., Pereira M., Chandrasekera C., Johari V. Molecular signatures in protein sequences that are characteristic of cyanobacteria and plastid homologues. Int. J. Syst. Evol. Microbiol. 2003;53:1833–1842.
  • Hagen K.D., Meeks J.C. The unique cyanobacterial protein OpcA is an allosteric effector of glucose-6-phosphate dehydrogenase in Nostoc punctiforme ATCC 29133. J. Biol. Chem. 2001;276:11477–11486.
  • Harlow T.J., Gogarten J.P., Ragan M.A. A hybrid clustering approach to recognition of protein families in 114 microbial genomes. BMC Bioinformatics. 2004;5
  • Hendy M., Penny M. Spectral analysis of phylogenetic data. J. Classif. 1993;10:5–24.
  • Honda D., Yokota A., Sugiyama J. Detection of seven major evolutionary lineages in cyanobacteria based on the 16S rRNA gene sequence analysis with new sequences of five marine Synechococcus strains. J. Mol. Evol. 1999;48:723–739.
  • Huang J., Xu Y., Gogarten J.P. The presence of a haloarchaeal type tyrosyl tRNA synthetase marks the opisthokonts as monophyletic. Mol. Biol. Evol. 2005;22:2142–2146.
  • Ikeuchi M., Tabata S. Synechocystis sp. PCC 6803—A useful tool in the study of the genetics of cyanobacteria. Photosynth. Res. 2001;70:73–83.
  • Iwai M., Katoh H., Katayama M., Ikeuchi M. Improved genetic transformation of the thermophilic cyanobacterium, Thermosynechococcus elongatus BP-1. Plant Cell Physiol. 2004;45:171–175.
  • Jain R., Rivera M.C., Lake J.A. Horizontal gene transfer among genomes: The complexity hypothesis. Proc. Natl. Acad. Sci. 1999;96:3801–3806.
  • Koonin E.V., Makarova K.S., Aravind L. Horizontal gene transfer in prokaryotes: Quantification and classification. Annu. Rev. Microbiol. 2001;55:709–742.
  • Kunin V., Goldovsky L., Darzentas N., Ouzounis C.A. The net of life: Reconstructing the microbial phylogenetic network. Genome Res. 2005;15:954–959.
  • Lawrence J.G. Gene transfer in bacteria: Speciation without species? Theor. Popul. Biol. 2002;61:449–460.
  • Lawrence J.G., Hendrickson H. Genome evolution in bacteria: Order beneath chaos. Curr. Opin. Microbiol. 2005;8:572–578.
  • Lento G.M., Hickson R.E., Chambers G.K., Penny D. Use of spectral analysis to test hypotheses on the origin of pinnipeds. Mol. Biol. Evol. 1995;12:28–52.
  • Lerat E., Daubin V., Moran N.A. From gene trees to organismal phylogeny in prokaryotes: The case of the γ-Proteobacteria. PLoS Biol. 2003;1
  • Lindell D., Sullivan M.B., Johnson Z.I., Tolonen A.C., Rohwer F., Chisholm S.W. Transfer of photosynthesis genes to and from Prochlorococcus viruses. Proc. Natl. Acad. Sci. 2004;101:11013–11018.
  • Lodders N., Stackebrandt E., Nubel U. Frequent genetic recombination in natural populations of the marine cyanobacterium Microcoleus chthonoplastes. Environ. Microbiol. 2005;7:434–442.
  • Mann N.H., Cook A., Millard A., Bailey S., Clokie M. Marine ecosystems: Bacterial photosynthesis genes in a virus. Nature. 2003;424
  • Millard A., Clokie M.R., Shub D.A., Mann N.H. Genetic organization of the psbAD region in phages infecting marine Synechococcus strains. Proc. Natl. Acad. Sci. 2004;101:11007–11012.
  • Miller S.R., Augustine S., Olson T.L., Blankenship R.E., Selker J., Wood A.M. Discovery of a free-living chlorophyll d-producing cyanobacterium with a hybrid proteobacterial/cyanobacterial small-subunit rRNA gene. Proc. Natl. Acad. Sci. 2005;102:850–855.
  • Mirkin B.G., Fenner T.I., Galperin M.Y., Koonin E.V. Algorithms for computing parsimonious evolutionary scenarios for genome evolution, the last universal common ancestor and dominance of horizontal gene transfer in the evolution of prokaryotes. BMC Evol. Biol. 2003;3
  • Mitreva M., Blaxter M.L., Bird D.M., McCarter J.P. Comparative genomics of nematodes. Trends Genet. 2005;21:573–581.
  • Morandi A., Zhaxybayeva O., Gogarten J.P., Graf J. Evolutionary and diagnostic implications of intragenomic heterogeneity in the 16S rRNA gene in Aeromonas strains. J. Bacteriol. 2005;187:6561–6564.
  • Nagai T., Ru S., Katoh A., Dong S., Kuwabara T. An extracellular hemolysin homolog from cyanobacterium Synechocystis sp. PCC6803. In: Proceedings of 12th International Congress on Photosynthesis. CSIRO Publishing; Melbourne, Australia: 2001. pp. S36–S10.
  • Nakamura Y., Itoh T., Matsuda H., Gojobori T. Biased biological functions of horizontally transferred genes in prokaryotic genomes. Nat. Genet. 2004;36:760–766.
  • Ochman H., Lawrence J.G., Groisman E.A. Lateral gene transfer and the nature of bacterial innovation. Nature. 2000;405:299–304.
  • Olendzenski L., Zhaxybayeva O., Gogarten J.P. Horizontal gene transfer: A new taxonomic principle? In: Syvanen M., Kado C., editors. Horizontal gene transfer. Academic Press; New York: 2002. pp. 427–435.
  • Otero A., Vincenzini M. Nostoc (Cyanophyceae) goes nude: Extracellular polysaccharides serve as a sink for reducing power under unbalanced C/N metabolism. J. Phycol. 2004;40:74–81.
  • Partensky F., Hess W.R., Vaulot D. Prochlorococcus, a marine photosynthetic prokaryote of global significance. Microbiol. Mol. Biol. Rev. 1999;63:106–127.
  • Ragan M.A. Phylogenetic inference based on matrix representation of trees. Mol. Phylogenet. Evol. 1992;1:53–58.
  • Ragan M.A. Detection of lateral gene transfer among microbial genomes. Curr. Opin. Genet. Dev. 2001;11:620–626.
  • Rippka R., Deruelles J., Waterbury J.B., Herdman M., Stanier R.Y. Generic assignments, strain histories and properties of pure cultures of cyanobacteria. J. Gen. Microbiol. 1979;111:1–61.
  • Rocap G., Larimer F.W., Lamerdin J., Malfatti S., Chain P., Ahlgren N.A., Arellano A., Coleman M., Hauser L., Hess W.R., et al. Genome divergence in two Prochlorococcus ecotypes reflects oceanic niche differentiation. Nature. 2003;424:1042–1047.
  • Rudi K., Skulberg O.M., Jakobsen K.S. Evolution of cyanobacteria by exchange of genetic material among phyletically related strains. J. Bacteriol. 1998;180:3453–3461.
  • Schmidt H.A., Strimmer K., Vingron M., von Haeseler A. TREE-PUZZLE: Maximum likelihood phylogenetic analysis using quartets and parallel computing. Bioinformatics. 2002;18:502–504.
  • Seo P.S., Yokota A. The phylogenetic relationships of cyanobacteria inferred from 16S rRNA, gyrB, rpoC1 and rpoD1 gene sequences. J. Gen. Appl. Microbiol. 2003;49:191–203.
  • Snel B., Bork P., Huynen M.A. Genomes in flux: The evolution of archaeal and proteobacterial gene content. Genome Res. 2002;12:17–25.
  • Sullivan M.B., Coleman M.L., Weigele P., Rohwer F., Chisholm S.W. Three Prochlorococcus cyanophage genomes: Signature features and ecological interpretations. PLoS Biol. 2005;3
  • Swofford D. PAUP* 4.0 beta version, phylogenetic analysis using parsimony (and other methods). Sinauer Associates, Inc; Sunderland, MA: 1998.
  • Tatusov R.L., Fedorova N.D., Jackson J.D., Jacobs A.R., Kiryutin B., Koonin E.V., Krylov D.M., Mazumder R., Mekhedov S.L., Nikolskaya A.N., et al. The COG database: An updated version includes eukaryotes. BMC Bioinformatics. 2003;4
  • Thompson J.D., Higgins D.G., Gibson T.J. CLUSTAL W: Improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 1994;22:4673–4680.
  • Ting C.S., Rocap G., King J., Chisholm S.W. Cyanobacterial photosynthesis in the oceans: The origins and significance of divergent light-harvesting strategies. Trends Microbiol. 2002;10:134–142.
  • Turner S. Molecular systematics of oxygenic photosynthetic bacteria. Plant Syst. Evol. Suppl. 1997;11:13–52.
  • Turner S., Pryer K.M., Miao V.P., Palmer J.D. Investigating deep phylogenetic relationships among cyanobacteria and plastids by small subunit rRNA sequence analysis. J. Eukaryot. Microbiol. 1999;46:327–338.
  • Wilmotte A., Herdman M. Phylogenetic relationships among the cyanobacteria based on 16S rRNA sequences. In: Garrity G.M., editor. Bergey's manual of systematic bacteriology. Springer; New York: 2001. pp. 487–493.
  • Wolf Y.I., Aravind L., Grishin N.V., Koonin E.V. Evolution of aminoacyl-tRNA synthetases—Analysis of unique domain architectures and phylogenetic trees reveals a complex history of horizontal gene transfer events. Genome Res. 1999;9:689–710.
  • Yang Z. Maximum likelihood phylogenetic estimation from DNA sequences with variable rates over sites: Approximate methods. J. Mol. Evol. 1994;39:306–314.
  • Yap W.H., Zhang Z., Wang Y. Distinct types of rRNA operons exist in the genome of the actinomycete Thermomonospora chromogena and evidence for horizontal transfer of an entire rRNA operon. J. Bacteriol. 1999;181:5201–5209.
  • Zeidner G., Bielawski J.P., Shmoish M., Scanlan D.J., Sabehi G., Beja O. Potential photosynthesis gene recombination between Prochlorococcus and Synechococcus via viral intermediates. Environ. Microbiol. 2005;7:1505–1513.
  • Zhaxybayeva O., Gogarten J.P. Bootstrap, Bayesian probability and maximum likelihood mapping: Exploring new tools for comparative genome analyses. BMC Genomics. 2002;3
  • Zhaxybayeva O., Gogarten J. An improved probability mapping approach to assess genome mosaicism. BMC Genomics. 2003;4
  • Zhaxybayeva O., Lapierre P., Gogarten J.P. Genome mosaicism and organismal lineages. Trends Genet. 2004;20:254–260.

Articles from Genome Research are provided here courtesy of Cold Spring Harbor Laboratory Press