Copyright © 2009 Le Corguillé et al; licensee BioMed Central Ltd.

Plastid genomes of two brown algae, Ectocarpus siliculosus and Fucus vesiculosus: further insights on the evolution of red-algal derived plastids

1 CNRS, FR2424, Computer and Genomics Resource Centre, Station Biologique, Roscoff, France
2 UPMC Univ. Paris 06, FR2424, Computer and Genomics Resource Centre, Station Biologique, Roscoff, France
3 Centre of Marine Sciences, University of Algarve, Marine Ecology and Evolution, Faro, Portugal
4 CNRS, UMR7139, Marine Plants and Biomolecules, Station Biologique, Roscoff, France
5 UPMC Univ. Paris 06, UMR7139, Marine Plants and Biomolecules, Station Biologique, Roscoff, France
6 CEA, DSV, Institut de Génomique, Genoscope, Evry, France
7 CNRS, UMR 8030, Evry, France
8 Université d'Evry, Evry, France

Corresponding author. #Contributed equally.

Gildas Le Corguillé: lecorguille/at/sb-roscoff.fr; Gareth Pearson: gpearson/at/ualg.pt; Marta Valente: msvalente/at/ualg.pt; Carla Viegas: caviegas/at/ualg.pt; Bernhard Gschloessl: bernhard.gschloessl/at/univ-rennes1.fr; Erwan Corre: corre/at/sb-roscoff.fr; Xavier Bailly: bailly/at/sb-roscoff.fr; Akira F Peters: akirapeters/at/gmail.com; Claire Jubin: claire.jubin/at/gmail.com; Benoit Vacherie: bvacheri/at/genoscope.cns.fr; J Mark Cock: cock/at/sb-roscoff.fr; Catherine Leblanc: leblanc/at/sb-roscoff.fr

Received June 4, 2009; Accepted October 16, 2009.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract Background Heterokont algae, together with cryptophytes, haptophytes and some alveolates, possess red-algal derived plastids. The chromalveolate hypothesis proposes that the red-algal derived plastids of all four groups have a monophyletic origin resulting from a single secondary endosymbiotic event.
However, due to incongruence between nuclear and plastid phylogenies, this controversial hypothesis remains under debate. Large-scale genomic analyses have been shown to be a powerful tool for phylogenetic reconstruction but insufficient sequence data have been available for red-algal derived plastid genomes. Results The chloroplast genomes of two brown algae, Ectocarpus siliculosus and Fucus vesiculosus, have been fully sequenced. These species represent two distinct orders of the Phaeophyceae, which is a major group within the heterokont lineage. The sizes of the circular plastid genomes are 139,954 and 124,986 base pairs, respectively, the size difference being due principally to the presence of longer inverted repeat and intergenic regions in E. siliculosus. Gene contents of the two plastids are similar with 139-148 protein-coding genes, 28-31 tRNA genes, and 3 ribosomal RNA genes. The two genomes also exhibit very similar rearrangements compared to other sequenced plastid genomes. The tRNA-Leu gene of E. siliculosus lacks an intron, in contrast to the F. vesiculosus and other heterokont plastid homologues, suggesting its recent loss in the Ectocarpales. Most of the brown algal plastid genes are shared with other red-algal derived plastid genomes, but a few are absent from raphidophyte or diatom plastid genomes. One of these regions is most similar to an apicomplexan nuclear sequence. The phylogenetic relationship between the plastids of heterokonts, cryptophytes and haptophytes (collectively referred to as chromists) was investigated using several datasets of concatenated proteins from two cyanobacterial genomes and 18 plastid genomes, including most of the available red algal and chromist plastid genomes. Conclusion The phylogenetic studies using concatenated plastid proteins still do not resolve the question of the monophyly of all chromist plastids.
However, these results support both the monophyly of heterokont plastids and that of cryptophyte and haptophyte plastids, in agreement with nuclear phylogenies. Background The endosymbiotic captures of free-living prokaryotes, leading to the evolution of two types of organelles, mitochondria and plastids, are considered to be key events in the establishment and success of extant eukaryotic lineages [1,2]. If all mitochondria are likely to be derived from an α-proteobacterium-like ancestor, possibly due to a single and ancient endosymbiotic event, the history of plastid acquisition in the diverse photosynthetic eukaryotic lineages seems to be more complex [3-6]. It is now largely accepted that a single primary endosymbiotic event involving the capture of a cyanobacterium led to an ancestral primary plastid, which subsequently gave rise to the green plastids of the terrestrial plants and chlorophytes, the rhodoplasts of red algae and the cyanelles of the glaucophytes. Once established, primary red or green algal plastids later spread independently to other eukaryote lineages via secondary or tertiary endosymbioses, whereby a photosynthetic eukaryote was engulfed by another eukaryote. Subsequently, plastids have also been independently lost and/or replaced in several eukaryote lineages, making the reconstruction of plastid evolution very difficult. The current consensus of eukaryote phylogeny recognizes six putative super-clusters: Opisthokonta, Amoebozoa, Plantae, Chromalveolata, Rhizaria, and Excavata [7,8], but this division is still debated [9,10]. The three primary plastid-containing lineages, Viridiplantae, Rhodophyta and Glaucophyta form the "Plantae" or "Archaeplastida" supergroup. Photosynthetic eukaryotes with secondary or tertiary plastids have evolved independently in the Chromalveolata, Rhizaria, and Excavata [3,5]. 
Among the secondary plastids, chlorophyll c-containing plastids have been shown to be derived from an ancestral red alga via a secondary endosymbiotic process that took place around one billion years ago [11,12]. This type of plastid is found in Cryptophyta, Haptophyta, Heterokonta (also called stramenopiles) and Dinophyceae algae [3,4]. Cryptophyta, Haptophyta and Heterokonta eukaryotic lineages have been grouped under the name of "Chromista" by Cavalier-Smith [13], and were later associated with the Alveolata, which includes the apicomplexans, dinoflagellates and ciliates, to form the "Chromalveolata" supergroup. In 1999, Cavalier-Smith proposed that all the chlorophyll c-containing plastids were derived from a single secondary endosymbiotic event and that the common ancestor of chromalveolates was originally photosynthetic [14]. During diversification of the four extant chromalveolates lineages, photosynthetic capacity and/or the plastid organelle would then have been independently lost several times in different eukaryotic lineages, such as oomycetes (non-photosynthetic heterokonts), apicomplexa or ciliates (non-photosynthetic alveolates). According to this so-called "chromalveolate" hypothesis, plastid and nuclear genomes have similar evolutionary histories and one would expect monophyly of chromalveolate lineages in both nuclear and plastid phylogenies. This hypothesis has been extensively debated over the last ten years (for recent references, [5,6,15-17]), in part because of incongruence between plastid and nuclear phylogenies [9]. At the nuclear level, both the monophyly of heterokonts and alveolates and that of cryptophytes and haptophytes have received increasing support in recent years (for recent review and references therein, [6]). 
Two contemporary phylogenetic analyses based on expressed sequences tag surveys of the cryptomonad Guillardia theta and the haptophyte Emiliania huxleyi supported the close relationship of cryptophyte and haptophyte host lineages [18,19]. In nuclear phylogenies alveolates and heterokonts often form a sister group [9,20]. Unexpectedly, several large scale nuclear phylogenies have also shown a very robust relationship between members of Rhizaria, cercozoans, and these two main clades of the "chromalveolates", but with the exclusion of haptophytes and cryptophytes [18,21,22]. The debate is becoming more complex with the emergence of this new putative SAR (stramenopiles/alveolata/rhizaria) supergroup, as proposed by Burki [23]. Recent phylogenetic studies employing large gene- and taxon-rich datasets continue to question the reality of the "chromalveolate" supergroup, by placing the haptophyte-cryptophyte clade as a sister group to the Plantae [24,25] or by having them emerging independently and separately from the SAR supergroup [10]. It is however well known that reconstructing the evolution of host cell lineages can be difficult, especially because of the chimeric nature of nuclear genomes and because large-scale horizontal gene transfers have occurred in some lineages during evolution [26]. Plastid genomes are less affected by horizontal gene transfer, with some rare exceptions [27]. At the plastid level, the monophyly of chromist plastids is supported by analyses of single genes [28], of small numbers of concatenated plastid genes [12,29], and of larger datasets of plastid-associated genes, i.e. plastid and nuclear-encoded plastid-targeted genes [30-35]. The relationships among chlorophyll c-containing plastids are, however, particularly hard to resolve and the results obtained are sometimes incongruent with host cell phylogenies [9]. Haptophyte plastid genes more often group with the heterokont/dinoflagellate clade, than with those of cryptophytes [30,31,33,34]. 
A clade grouping haptophyte and cryptophyte species has been inferred from some plastid gene phylogenies [31,33-35]. This clustering was not strongly supported and was highly dependent on the plastid gene dataset used [31,35] and/or on taxon-sampling [33,34]. Other variant topologies have included the placing of dinoflagellates either as a sister-group to haptophyte plastids [30,33] or to heterokont plastids [34,35]. However, a close evolutionary relationship between haptophyte and cryptophyte plastids would be consistent with the presence of a unique laterally transferred bacterial rpl36 gene in both plastid genomes [27]. Other multigene analyses produced alternative results, such as low support for the chromist clade [29] or paraphyly of red-algal derived plastids [35,36]. The inability to recover congruencies between plastid and nuclear phylogenies, especially concerning haptophyte and cryptophyte monophyly, may be explained by poor taxon sampling of red algal and chromist species [31,36]. Until now, insufficient sequence data have been available for the chromalveolates, in terms of both nuclear and plastid genome sequences. In public databases, more than 110 complete plastid genomes are available from land plants and green algae, whereas less than 15 sequences belong to red algae or photosynthetic chromalveolate species. Only five complete plastid sequences have been reported for red algal species [36-39]. For the chromalveolates, with the exception of the highly diverged red-algal derived plastid genomes of non-photosynthetic apicomplexans [40] and those of dinoflagellates [41,42], complete plastid sequences have been published for two cryptomonads, Guillardia theta and Rhodomonas salina [11,31], one haptophyte, Emiliania huxleyi [43], 3 diatoms, Odontella sinensis, Phaeodactylum tricornutum and Thalassiosira pseudonana [44,45], one raphidophyte Heterosigma akashiwo [46] and one xanthophyte Vaucheria litorea [47]. 
Here we report the complete sequences of the plastid genomes of Ectocarpus siliculosus and Fucus vesiculosus. These sequences represent the first fully characterized plastid genomes from two distinct orders of Phaeophyceae, namely Ectocarpales and Fucales [48]. We have performed phylogenetic studies using large sets of genes and different reconstruction methods. The results still do not resolve the question of the monophyly of chromist plastids. However, the topologies of concatenated plastid protein phylogenetic trees support both the monophyly of heterokont plastids and that of cryptophyte and haptophyte plastids, in agreement with nuclear phylogenies. Results Structure and gene content of the phaeophyte plastid genomes The plastid genomes of E. siliculosus and F. vesiculosus are 139,954 and 124,986 base pairs (bp) in size, respectively, and both contain two inverted repeat regions (IR). These IRs divide the circular molecules into large (LSC) and small single copy (SSC) regions (Figure 1).
Gene organisation is highly similar between the two genomes and around two thirds of both molecules are conserved with respect to both gene identity and order. About 50% of each genome is incorporated into two large, locally collinear blocks. One block contains a large proportion of ribosomal protein-coding genes and covers up to 24% of the plastid genomes. The second block extends between trnM and atpA and covers 26-27.5% of each genome (Figure 1). The two plastid genomes are also very similar in terms of total gene content (Table 1). Like most of the green and red photosynthetic plastid genomes, excluding those of dinoflagellates [43], they possess the common core set of 44 genes, with the exception of the psbZ gene (listed in additional file 2, Table S3). They also contain 42 additional protein-coding genes, which are only found in red algal and chromist plastid genomes, giving a total of 86 genes that are shared with the red plastid lineage (Table 1). These genes mainly encode essential plastid proteins, involved in transcription, protein synthesis and transport, and photosynthetic metabolism, such as components of ATP synthase, cytochrome, photosystem I and II complexes. Nine genes are shared by all the chromist plastid genomes, but not with all the red algal plastid genomes (Table 1). Another 27 genes are encoded by most heterokont plastid genomes, but are not consistently present in the plastid genomes of haptophytes, cryptophytes and red algae. Of the 17 remaining genes that are common to E. siliculosus, F. vesiculosus and V. litorea cpDNAs, nine are present in the raphidophyte plastid genome, but all are absent from the diatom cpDNAs (Table 1).
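The gene-content categories described above (genes shared by every genome, genes restricted to some lineages) amount to set operations over per-genome gene inventories. The following is a minimal sketch of that bookkeeping, not the procedure used to build Table 1; the function names and the toy inventories are hypothetical and contain only a handful of illustrative gene names rather than the real gene lists.

```python
from collections import Counter


def shared_and_accessory(gene_sets):
    """Split inventories into a core (present in every genome) and the
    per-genome genes that fall outside that core."""
    core = set.intersection(*(set(genes) for genes in gene_sets.values()))
    accessory = {name: set(genes) - core for name, genes in gene_sets.items()}
    return core, accessory


def presence_counts(gene_sets):
    """Count in how many genomes each gene occurs."""
    counts = Counter()
    for genes in gene_sets.values():
        counts.update(set(genes))
    return counts


# Toy inventories for illustration only (assumed, NOT taken from Table 1).
toy = {
    "E_siliculosus": {"psbA", "rbcL", "atpA", "ycf60"},
    "F_vesiculosus": {"psbA", "rbcL", "atpA", "ycf60"},
    "diatom": {"psbA", "rbcL", "atpA"},
}

core, accessory = shared_and_accessory(toy)
counts = presence_counts(toy)
print(sorted(core))
print({name: sorted(genes) for name, genes in accessory.items()})
```

With real inventories, the same two functions would reproduce counts of the kind reported in the text, for example the number of genes common to the brown algae and V. litorea but absent from diatoms.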
Among the unknown plastid proteins, the conserved open reading frames (ORFs) Ectocarpus Escp124 and Fucus ORF76 encode putative proteins of 222 and 229 amino acids, with 48% identity between species. Both protein sequences are predicted to possess five transmembrane helices. A homolog of these plastid proteins is also encoded by the plastid genome of the xanthophyte V. litorea. Interestingly, the most similar protein in the public databases is a nuclear-encoded protein, Tic20, found in several apicomplexan species, including Toxoplasma and Plasmodium. The C-terminal ends of these proteins also share weak similarity with the conserved hypothetical plastid proteins encoded by the ycf60 genes of plastid genomes from E. huxleyi, G. tenuistipitata and Cyanidiales (see partial multiple alignment provided in additional file 3, Figure S2). Phylogenetic analyses For phylogenetic analyses, three concatenated amino acid datasets were constructed (see additional file 2, Table S3) and analysed using maximum likelihood (ML), neighbour joining (NJ) and Bayesian inference (BI) methods. For the ML analyses, cpREV and JTT amino acid substitution matrices gave the same tree topologies (data not shown). Trees were constructed using a dataset of 44 proteins (8,652 amino-acid positions) from a broad range of species, including 13 taxa of red-algal type plastids, 4 taxa of Viridiplantae, the glaucophyte Cyanophora, and two cyanobacteria (see additional file 2, Table S4 for species list). Plastid sequences of chlorophyll-c-containing dinoflagellates were not included in the analyses because this would have resulted in a significantly reduced common protein dataset. All but four of the nodes in the trees were well resolved and supported by the three different methods (Figure 3).
To strengthen the topology of branching in the region of the tree corresponding to the red-alga derived plastids, we decided to increase the protein dataset by focusing the phylogenetic studies on 13 species. A full dataset of 83 plastid-encoded proteins (16,738 amino acid positions) was analyzed in parallel with a sub-dataset of 33 slowly-evolving plastid proteins, excluding the fast-evolving proteins (Figure 4).
To further test these phylogenetic positions, we compared different topologies by performing the approximately unbiased (AU) and Shimodaira-Hasegawa (SH) tests (Figure 5).
Discussion Monophyly and evolution of heterokont plastid genomes Until very recently, all of the plastid genomes available for the heterokont lineage were from diatoms (O. sinensis, P. tricornutum and T. pseudonana), and these genomes featured conserved gene content and gene clusters [45]. Along with the recently published plastid genomes of two strains of the raphidophyte H. akashiwo [46] and the xanthophyte V. litorea [47], the complete sequences of the E. siliculosus and F. vesiculosus plastid genomes presented here significantly increase the number and diversity of heterokont plastid genomes available, allowing a more extensive comparison of these genomes. Our results support a unique origin for all heterokont plastids, based on similarity in terms of gene content (Table 1) and on their forming a strongly supported group in all our phylogenetic analyses (Figures 3 ...). All the Xanthophyceae or Phaeophyceae plastid genomes analyzed to date, including that of F. vesiculosus described here, contain a tRNA-Leu gene with a single intron [47,50]. This canonical group I intron is thought to have been acquired from the ancestral cyanobacterial endosymbiont and to have been lost independently in several lineages of plastids, including the red algae and almost all their secondary plastid derivatives, except the Xanthophyceae/Phaeophyceae lineage [50]. Given the high sequence similarities found between these plastid tRNA-Leu genes in V. litorea, F. vesiculosus and E. siliculosus (86 to 93% sequence identity), they are probably derived from the same ancestral tRNA-Leu gene, containing the endosymbiont-derived intron. In the E. siliculosus gene, the loss of this intron is likely to be recent because the intron is still present in the plastid tRNA-Leu genes of Laminariales species and of two Ectocarpales, Pylaiella littoralis and Scytosiphon lomentaria (Figure 2). In terms of gene content, the brown algal plastid genomes seem to be more closely related to those of V. litorea and of H.
akashiwo than to those of diatoms and this is consistent with evolutionary relationships of the nuclear compartment [51,52]. Although the structural organisation of plastid genomes is highly conserved within the brown algae (additional file 1, Figure S1) and within diatoms [45], there is evidence of intensive gene rearrangements having occurred earlier in evolution after the separation of diatoms from raphidophytes, xanthophytes and phaeophytes. Moreover, more extensive gene losses seem to have occurred in diatom plastid genomes than in other heterokonts (Table 1). These genes could have been transferred to the nucleus or replaced by bacterial counterparts, functionally-integrated through horizontal gene transfer as often seen in the diatom nuclear genome [53]. All these data, together with the topologies of plastid phylogenetic trees (Figure 3) ... What is the closest relative of the heterokont plastid clade? A critical step for the transformation of the endosymbiont into a permanent organelle was the establishment of an efficient protein targeting and translocation system from the nucleus to the plastid [1,4]. The canonical Tic/Toc protein import complex of secondary plastids was inherited from the first red-algal endosymbiont, with components of both eukaryotic and eubacterial origin [1,54,55]. Both brown algal plastid genomes have a gene (Escp124 in Ectocarpus and ORF76 in Fucus) that shares similarity with the Tic20-like genes in xanthophyte, haptophyte and red algal plastid genomes. There are no homologues of this gene in raphidophyte, diatom and cryptophyte plastid genomes (Table 1). This plastid-encoded Tic20 gene (also called ycf60) encodes a small membrane protein and is thought to be endosymbiont-derived with a cyanobacterial origin [1,54,55].
Interestingly, the highest similarity scores of brown algal and xanthophyte plastid ORFs were found with a homologous protein encoded in the nucleus of several apicomplexan species, including Toxoplasma and Plasmodium. In T. gondii, this Tic20-like protein has been shown to be essential for protein import into the apicoplast [56] and is therefore likely to be linked to apicoplast evolution [4]. Escp124 and ORF76 protein sequences are also predicted to have five transmembrane regions, suggesting a putative location in the plastid membrane. It is now widely accepted that alveolates and heterokonts are derived from a common host cell ancestor. Escp124 and ORF76 could be footprints of a common photosynthetic ancestor of heterokonts and apicomplexans. This hypothesis is in agreement with several recently published studies suggesting that contemporary alveolates are derived from a photosynthetic ancestor. These studies include the characterization of a photosynthetic alveolate closely related to apicomplexan parasites [57], the identification of plastid-derived genes in a non-photosynthetic alveolate [58] and the identification of remnant algal-related genes in ciliates [59]. Is the monophyly of chromist plastids still in doubt? All the phylogenetic analyses carried out in this study suggest that the red algal ancestor of chromist plastids was more closely related to the more recently evolved red algae (Florideophyceae and Bangiophyceae) than to Cyanidiales, confirming the report by Sanchez-Puerta et al. [33]. It is worth mentioning that Cyanidiales are extremophile unicellular red algae and have been shown to be the earliest diverging red algal group. They emerge very distinctly from the other multi-cellular red algal taxa in nuclear phylogenies [29]. 
Within the chromist plastid clade, most plastid phylogenies have hitherto featured a clade grouping haptophyte and heterokont plastids [29,30] and the relationship between haptophyte and cryptophyte plastids was never strongly recovered in previous studies [31,33-35]. These conflicting results have been discussed in the light of taxon- or data-sampling limitations [31,34]. Our results do not support a preferential link between heterokont and haptophyte plastids, either in terms of gene content (Table 1) or of phylogenetic relationship. Moreover, these phylogenetic analyses strongly support the monophyly of haptophyte and cryptophyte plastids (Figure 4). Horizontal gene transfers into plastid genomes happened only rarely after the establishment of the endosymbiont within the host cell. The major events which can affect the structure of the organelle genome are gene transfer to the nucleus and/or gene loss. Indeed, red algal plastid genomes possess more than 230 protein-coding genes while those derived from a red-algal endosymbiont encode fewer than 150, of which more than half are shared by all the genomes (Table 1). An exceptional case is the drastic reduction of plastid minicircular genomes of peridinean dinoflagellates [41]. In other plastid genomes derived from a red algal endosymbiont, the remaining pool of genes is the result of losses that have occurred independently in the different lineages and of retention that could constitute interesting fingerprints of ancestral plastid gene contents. A comparison of gene content did not reveal any particular relationships between heterokonts and cryptophytes/haptophytes and therefore did not provide support for a common history.
For the phylogenetic analyses, whereas the use of the complete dataset supported a different red-algal origin for heterokont plastids (Figure 4A), ... It has been shown that plastid metabolism could also involve a significant number of nuclear-encoded proteins recruited from diverse origins, such as laterally transferred genes from Chlamydiae [62] or green algae [63-65]. Phylogenies based on nuclear-encoded plastid-targeted proteins could then trace and reflect complex evolutionary pathways, whereas phylogenies based on complete sets of plastid-encoded genes should better reflect the evolution of the organelle since its engulfment by the host cell. As illustrated by the high resolution of the heterokont plastid clade, additional plastid genomes from haptophytes, cryptophytes and dinoflagellates, but certainly also from other evolved red algae, will be required to fully resolve chromist plastid phylogenies and, subsequently, test the different hypotheses concerning red-algal derived plastid origin(s). Conclusion In conclusion, this study of two novel plastid genomes belonging to brown algal species has shown the importance of increased taxon sampling when analysing phylogenetic relationships based on large datasets. As expected, the phylogenetic analyses showed that heterokont plastids are monophyletic, although very diverse in terms of gene arrangement. There is also evidence that some heterokont (phaeophyte and xanthophyte) plastids have retained fingerprints indicating a common ancestry with alveolate plastids. Moreover, monophyly of haptophyte and cryptophyte plastids was strongly recovered whatever the dataset or the method used, in complete agreement with large-scale nuclear phylogenies. Methods Algal material and DNA extraction E. siliculosus strain Ec32 (CCAP1310/4) was cultivated under laboratory conditions as previously described [66] and total DNA was prepared according to the method of Apt et al. [67]. F.
vesiculosus was collected from the field (Ria Formosa Natural Park, Portugal) and DNA was extracted from isolated plastids. Briefly, 20 g apical tissue free from visible epiphytes was cleaned by 2 min exposure in bleach (1% in filtered natural seawater), rinsed and homogenized in 100 mL cold extraction buffer containing 0.05 M MES (pH 6.1), 0.5 M sorbitol, 1 mM MgCl2, 1 mM MnCl, 0.5 mM K2HPO4, 5 mM EDTA, 1% BSA, 2% PVP, and 2 mM Na-ascorbate. The homogenate was passed through cotton gauze and 1 μm nylon mesh, centrifuged for 2 min at 2000 × g at 4°C. The supernatant was transferred to new 50 mL tubes and centrifuged at 5000 × g for 5 min. The pellet containing plastids was gently resuspended in a total of 10 mL of extraction buffer and re-centrifuged (5 min, 5000 × g, 4°C). The pellet was resuspended in new extraction buffer and applied to a 30:50% sucrose step gradient. After centrifugation for 45 min at 5000 × g (4°C), the plastids were removed from the 30 and 50% sucrose interface, carefully resuspended in a buffer containing 0.05 M HEPES (pH 7.5), 0.5 M sorbitol, 1 mM MgCl2, 1 mM MnCl, 0.5 mM K2HPO4. After observation under the microscope to determine the quality of the plastid preparation, plastids were centrifuged again for 10 min at 5000 × g. The supernatant was removed and plastids were stored at -80°C prior to DNA extraction using the CTAB method [68]. Genome Sequencing, Assembly and Annotation For E. siliculosus, several scaffolds corresponding to plastid DNA were detected by similarity to other plastid genomes in an assembly of shotgun sequenced total genomic DNA produced by Genoscope http://www.genoscope.cns.fr/spip/-Ectocarpus-siliculosus-.html. These scaffolds were removed from the rest of the sequence data and the sequence of the circular genome was completed by manual assembly and PCR amplification of gap regions. 
The plastid genome was annotated using the GenDB interface [69], available through the bioinformatics facilities of the Marine Genomics Europe Network of Excellence. For F. vesiculosus, two main strategies were used to obtain the full genome sequence: 1) Plastid-enriched DNA (cpDNA) was digested (HindIII), and cloned into pBluescript II (SK-) (Stratagene). Positive colonies were randomly picked and those with inserts > 1 kb after digestion were end-sequenced. 2) Plastid DNA was used to make uncloned, adaptor-ligated libraries for a genome-walking approach using long-distance PCR (GenomeWalker™ kit, Clontech, Palo Alto, USA). Gaps in the genome were filled by PCR, based on predicted gene organization in red-lineage plastids. The F. vesiculosus plastid genome was assembled using CodonCode Aligner (CodonCode Corp., USA). Protein coding genes and putative open reading frames (ORFs) were identified by database comparison (Blastx, [70]) and online tools (ORF Finder, NCBI). Ribosomal and tRNA genes were identified using RNAmmer (http://www.cbs.dtu.dk/services/RNAmmer/) [71] and ARAGORN (http://130.235.46.10/ARAGORN/) [72], respectively. The two plastid sequences are available under the following EMBL accession numbers: E. siliculosus (FP102296) and F. vesiculosus (FM957154). The physical maps of the circular genomes were drawn using GenomeVx (freely available at wolfe.gen.tcd.ie/GenomeVx/). Phylogenetic Analyses For global gene content comparisons, the two brown algal plastid genomes were analysed together with those of the xanthophyte V. litorea [47] and the raphidophyte H. akashiwo [46] plus the 15 algal sequences and the two reference cyanobacterial genomes analysed by Khan et al. [31]. The phylogenetic analyses were conducted with a total of two cyanobacterial and 18 plastid genomes, including four complete genomes from red algae and nine from chromist species (see additional file 2, Table S4).
Three concatenated protein datasets were constructed from these genomes (additional file 2, Table S3). The first dataset corresponded to the 44 plastid protein-coding genes shared by all 20 species. In addition, a larger dataset of 83 proteins was built using all the plastid proteins common to the 13 red, cryptophyte, haptophyte and heterokont algae. A list of gene synonyms used during this study is provided in additional file 2 (Table S5), together with complementary gene annotation information. Single and concatenated protein sequences were aligned using MUSCLE [73] and each alignment was further optimised using GBlocks [74]. Datasets for individual genes were first analysed using maximum likelihood, in order to eliminate genes derived from horizontal transfer. Only the rpl36 protein phylogeny suggests a non red-algal origin for the haptophyte and cryptophyte genes, which grouped far outside the red algal and heterokont cluster, as previously reported [27]. This gene was therefore eliminated from the full 83-protein dataset. The average distance was calculated for each protein with Tree-Puzzle [75]. We excluded 50 "fast-evolving" protein sequences to produce a dataset of 33 "slowly-evolving" proteins, which present an average distance under the threshold of 0.6. This value was chosen in order to conserve at least half of the analysed positions for the 33-protein dataset. Phylogenetic analyses of concatenated protein data were carried out on 8,652, 16,738 and 8,404 amino acids corresponding, respectively, to the 44-, 83- and 33-protein datasets. A Maximum Likelihood (ML) approach was used to reconstruct phylogenetic trees using PHYML [76] under both cpREV [77] and JTT [78] amino acid substitution matrices with 4 gamma-distributed rate categories and estimated invariable sites. The neighbor-joining (NJ) method was performed with JTT amino acid substitution matrix using the Phylip software package [79]. 
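The selection of the "slowly-evolving" subset and the concatenation step described above can be sketched as follows. This is an illustrative outline under stated assumptions, not the authors' pipeline: the per-protein average distances are assumed to have been computed beforehand (in the study, with Tree-Puzzle), the alignments are assumed to be already trimmed, and the function names, protein labels and values below are hypothetical.

```python
def split_by_rate(mean_distance, threshold=0.6):
    """Partition proteins into slow (average distance strictly under the
    threshold) and fast (all others). The 0.6 default mirrors the text."""
    slow = sorted(p for p, d in mean_distance.items() if d < threshold)
    fast = sorted(p for p, d in mean_distance.items() if d >= threshold)
    return slow, fast


def concatenate(alignments, proteins):
    """Join per-protein alignments ({protein: {taxon: aligned_sequence}})
    into one sequence per taxon, keeping only taxa present in every protein."""
    if not proteins:
        return {}
    taxa = sorted(set.intersection(*(set(alignments[p]) for p in proteins)))
    supermatrix = {taxon: "" for taxon in taxa}
    for protein in proteins:
        lengths = {len(seq) for seq in alignments[protein].values()}
        if len(lengths) != 1:
            raise ValueError(f"{protein}: sequences are not aligned to equal length")
        for taxon in taxa:
            supermatrix[taxon] += alignments[protein][taxon]
    return supermatrix


# Hypothetical distances and alignments, for illustration only.
mean_distance = {"protA": 0.25, "protB": 0.75, "protC": 0.6}
alignments = {
    "protA": {"taxon1": "MK-L", "taxon2": "MKAL"},
    "protB": {"taxon1": "GG", "taxon2": "GA"},
    "protC": {"taxon1": "TTS", "taxon2": "TAS"},
}

slow, fast = split_by_rate(mean_distance)
full_matrix = concatenate(alignments, sorted(alignments))
slow_matrix = concatenate(alignments, slow)
print(slow, fast)
print(full_matrix)
```

A protein whose average distance equals the threshold is treated as fast here, because the text describes the retained proteins as lying under 0.6; that boundary handling is an interpretation rather than something the source states explicitly.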
For both the ML and NJ methods, bootstrap analyses of 1,000 replicates were used to provide confidence estimates for the phylogenetic tree topologies. Finally, Bayesian inference (BI) analyses were performed with PhyloBayes 3.1d [80] using 4 gamma-distributed rate categories. PhyloBayes was run using the site-heterogeneous CAT model as described in Lartillot et al. [81] and two independent chains with a total length up to 25,000 cycles, discarding the first 25% as burn-in and calculating the posterior consensus tree. Furthermore, a saturation test was performed on the different datasets to calculate the observed and predicted homoplasy rates as described in the PhyloBayes user manual. To statistically test the topologies of the trees, approximately unbiased (AU) and Shimodaira-Hasegawa (SH) analyses were performed on four topologies. These were selected to reflect the relative positions of haptophyte, cryptophyte and heterokont plastids and were generated by rearrangement of ML and NJ trees (if required). Site likelihoods for each topology were calculated using Tree-Puzzle on the two different concatenated datasets and the AU/SH tests were performed using CONSEL 0.1 [82]. Authors' contributions GLC, BG, CL annotated the E. siliculosus cpDNA. GLC, CL carried out the phylogenetic analysis. GP, MV, CV sequenced and assembled the cpDNA of F. vesiculosus. GP annotated the F. vesiculosus genome. GLC, GP, CL performed the comparative genomic analyses on both plastid genomes. AFP obtained and provided E. siliculosus cultures. CJ, BV sequenced and provided plastid contigs of E. siliculosus. EC, XB participated in design of phylogenetic and statistical approaches. GLC, GP, JMC contributed to manuscript writing. JMC helped to supervise the project. CL conceived and designed the project, wrote the manuscript. All authors read and approved the final manuscript. Additional file 1 Additional data and analyses on the plastid genomes of E. siliculosus and F. vesiculosus. 
Tables S1 and S2, general features of the two phaeophyte plastid genomes and reversal distance matrix for pairwise comparisons between heterokont plastid genomes. Figure S1, MAUVE genome comparison between E. siliculosus and F. vesiculosus plastid genomes. (105K, PDF)

Additional file 2

Genes and genomes used in the phylogenetic studies. Tables S3 to S5, lists of genes, accession numbers of plastid and bacterial genomes, and complementary information about gene synonyms and about some protein-coding genes used in the phylogenetic studies. (25K, PDF)

Additional file 3

Partial multiple alignment of Tic20 and yfc60 protein homologs. Figure S2, partial multiple alignment of Tic20 and yfc60 protein homologs from red alga-derived plastid and apicomplexan genomes. (96K, PDF)

Acknowledgements

We are grateful to Alexander Goesmann and Virginie Mittard-Runte for providing access to the GenDB platform and to Hameed Khan and John M. Archibald for providing their alignment of 45 concatenated proteins. We also thank Nicolas Lartillot for making available the latest version of PhyloBayes (3.1d). This work, performed within the framework of Marine Genomics Europe NoE 7 (EC contract N° GOCE-CT-2004-505403), was partially supported by the Brittany Regional Council (G. L. C. grant) and by FCT-FEDER (Portugal).

References