Copyright © 2006 Cui et al; licensee BioMed Central Ltd.

Adaptive evolution of chloroplast genome structure inferred using a parametric bootstrap approach

1 Department of Biology, Institute of Molecular Evolutionary Genetics, and Huck Institutes of Life Sciences, The Pennsylvania State University, University Park, PA 16802, USA
2 Department of Biology, University of Pennsylvania, Philadelphia, PA 19104, USA
3 Department of Computer Science and Engineering, University of South Carolina, Columbia, SC 29208, USA
4 Boyce Thompson Institute, Cornell University, Ithaca, NY 14853, USA

Corresponding author.
Liying Cui: liying/at/psu.edu; Jim Leebens-Mack: jhl10/at/psu.edu; Li-San Wang: lswang/at/med.upenn.edu; Jijun Tang: jtang/at/cse.sc.edu; Linda Rymarquis: lar24/at/cornell.edu; David B Stern: ds28/at/cornell.edu; Claude W dePamphilis: cwd3/at/psu.edu

Received June 15, 2005; Accepted February 9, 2006.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

Background

Genome rearrangements influence gene order and configuration of gene clusters in all genomes. Most land plant chloroplast DNAs (cpDNAs) share a highly conserved gene content and, with notable exceptions, a largely co-linear gene order. Conserved gene orders may reflect a slow intrinsic rate of neutral chromosomal rearrangements, or selective constraint. It is unknown to what extent observed changes in gene order are random or adaptive. We investigate the influence of natural selection on gene order in association with increased rate of chromosomal rearrangement.
We use a novel parametric bootstrap approach to test if directional selection is responsible for the clustering of functionally related genes observed in the highly rearranged chloroplast genome of the unicellular green alga Chlamydomonas reinhardtii, relative to ancestral chloroplast genomes.

Results

Ancestral gene orders were inferred and then subjected to simulated rearrangement events under the random breakage model with varying ratios of inversions and transpositions. We found that adjacent chloroplast genes in C. reinhardtii were located on the same strand much more frequently than in simulated genomes that were generated under a random rearrangement process (increased sidedness; p < 0.0001). In addition, functionally related genes were found to be more clustered than those evolved under random rearrangements (p < 0.0001). We report evidence of co-transcription of neighboring genes, which may be responsible for the observed gene clusters in C. reinhardtii cpDNA.

Conclusion

Simulations and experimental evidence suggest that both selective maintenance and directional selection for gene clusters are determinants of chloroplast gene order.

Background

The influence of genotype on phenotype is not limited to the coding of peptides and functional RNAs by nucleotide sequences. An organism's phenotype is also affected by the chromosomal arrangement of genes and the interaction of gene products. Comparative genomics has revealed a variety of gene clusters and chromosomal segments that have remained intact over hundreds of millions of years [1]. Selection for clustering of co-transcribed genes has been hypothesized to influence gene order within bacterial and organelle genomes, where gene clusters typically encode multiple components of a functional pathway [2]. For example, the ribosomal proteins are encoded by similar operons in archaebacteria, eubacteria and plastids [3].
In eukaryotic genomes, co-expression of neighboring genes is significantly associated with the functional roles of the genes (such as housekeeping genes or genes in the same metabolic pathway) [4,5]. One way that those genes become clustered is through tandem duplication, which usually results in functionally related genes being adjacent. On the other hand, unrelated genes may also be brought together through chromosome rearrangements (recombination, inversion and transposition). Unless selection is acting to maintain or promote gene clusters, gene orders in genomes subjected to rearrangements should become randomized with respect to function or co-expression profiles. Significant clustering has been inferred using permutation tests that compare observed physical distances between pairs or blocks of co-expressed or functionally related genes to a null distribution constructed from randomized gene orders [4,5]. However, this approach is limited since the evolutionary history of the genome was not considered. When comparing gene orders among related species, it is possible to estimate the ancestral genome and to simulate a null distribution for changes in gene order using a model. This evolutionary approach can be used to test directly the influence of selection on genome structure, that is, whether present-day genome structure has been influenced by directional selection for clustering of functionally related genes. Small genomes, especially those of organelles and bacteria, are well suited to global comparisons of gene order. Like eukaryotic genomes, they are subject to structural changes such as inversion, transposition or translocation, as well as gene loss and (more rarely) gene gain. Chloroplast DNAs in most land plants share a highly conserved gene content and similar gene orders [6]. Most cpDNAs include two identical regions in opposite orientations called the inverted repeat (IR), flanked by large single copy (LSC) and small single copy (SSC) regions. 
The IRs generally contain the bacterial-like rRNA gene clusters, and the genes involved in photosynthesis (photosystem I/II, cytochrome b6/f, and ATP synthase) are arranged similarly in chloroplast and cyanobacterial genomes [2,3,7]. Despite these well-characterized patterns, it is unknown to what extent the conserved gene order reflects a slow intrinsic rate of neutral chromosomal rearrangements, rather than selection against alternative gene orders.

A model of neutral rearrangement of gene order is required to test formally whether gene orders evolve under selection that favors some gene arrangements over others. Nadeau and Taylor first proposed a model for the neutral evolution of gene order in comparisons of mouse and human chromosomes [8]. This "random breakage model" provides a null hypothesis for the evolution of gene order. It assumes a random distribution of break points and allows all possible gene orders without restrictions. The random breakage model has been used to infer organismal phylogenies from gene order data [9]. The gene order difference can be measured using the inversion distance, which is the minimal number of inversions necessary to transform one gene order into another. Currently, the most accurate heuristic approach is implemented in the GRAPPA software [10], which is generally suitable for small taxon sets because the algorithm scores inversion medians for all nodes iteratively across all possible phylogenies. Algorithms for genomes with arbitrary rearrangements, a few deletions and duplications have been developed [11], and the capacity of GRAPPA can be scaled up with the disk-covering method (DCM) to potentially very large data sets [12].

The random breakage model does not account for recombination hotspots, which have been reported from human-mouse genome comparisons [13].
However, at this time it may be difficult to model these hotspots, because the precise locations of reused breakpoints are unknown due to insufficient resolution of gene orders and potential errors in homology assessment given the scale of eukaryotic chromosomes [14]. Thus, the fragile breakage model [13], as an alternative to the random breakage model, has not been well established.

Whereas gene order is generally conserved among land plant cpDNAs, very little synteny is observed between this group and cpDNAs of the chlorophytic green algae C. reinhardtii [15,16] and Chlorella vulgaris [17]. The apparently increased rearrangement rate is associated with invasion by a large number of short dispersed repeats (SDRs), for which the evolutionary distribution is still poorly defined. The large number of rearrangements provides an excellent opportunity to test whether natural selection has preferred some changes in gene order. Here we present novel statistics and parametric tests that lead us to reject the models of random rearrangement in favor of directional selection for clustering of functionally related genes in C. reinhardtii. We also present experimental evidence that adaptive evolution of chloroplast genome structure could be driven by the advantage of concerted regulation conferred by polycistronic transcription.

Results

Functional clusters are not randomly distributed

We compared gene orders of representative cpDNAs from land plants, including tobacco (Nicotiana tabacum, [GenBank:NC_001879]) [18] and liverwort (Marchantia polymorpha, [GenBank:NC_001319]) [19], a charophytic green alga (Chaetosphaeridium globosum [GenBank:NC_004115]) [20], chlorophytic green algae (Nephroselmis olivacea [GenBank:NC_000927] [21], C. vulgaris [GenBank:NC_001865] [17], C. reinhardtii [GenBank:BK000554] [16]), a green flagellate alga with uncertain affinities (Mesostigma viride [GenBank:NC_002186]) [22], and the plastid of Cyanophora paradoxa [GenBank:NC_001675] [23] (Figure 1).
To measure the genome structure in terms of clustering by chromosome locations and gene functions, we defined "sided blocks" as contiguous genes coded on the same strand of the plastid chromosome, and "functional clusters" as blocks of functionally related genes (see Methods). The randomness in the observed distribution of shared genes in chloroplast genomes with respect to gene function was assessed using a Kolmogorov-Smirnov test. The null hypothesis was rejected in all seven cpDNAs investigated for genes in functional categories such as ATP synthases and electron transport (p << 0.05, Table 1). While this test suggests some degree of functional clustering in all chloroplast genomes, it does not take into account the phylogenetic relationship of these organisms, so it is unclear whether functional clustering in chloroplast genomes is a legacy of genome organization in a cyanobacteria-like ancestor, or the product of selection on gene order in the face of genome rearrangements.
Extensive rearrangements from the ancestral chloroplast genome to C. reinhardtii

In order to investigate evolutionary changes of gene order, we constructed a phylogeny of seven representative cpDNAs and rooted it with the sequence of C. paradoxa [23]. Maximum parsimony, neighbor joining and maximum likelihood analyses of an alignment of 50 concatenated protein sequences, including a total of 19,836 aligned sites (Additional file 2), all yielded identical, fully resolved topologies with high bootstrap support (Figure 2A).
We scored the orders of 85 genes shared in the seven genomes (gene orders are in additional file 3). Then we used modified versions of GRAPPA [11,25] to compute the inversion distance between ancestral nodes and each terminal node (Figure 2B). The cpDNAs of two land plants, N. tabacum and M. polymorpha, were separated by an estimated 7 inversions based on the data set. One large inversion (~30 kb) in the LSC region has long been recognized to separate the two genomes [26]. Additional rearrangements are directly observable through comparison of gene order files for the two species (see additional file 5 for the sequences of gene order rearrangements). Using GRAPPA, all rearrangements were inferred as inversions, but the total number of inversion events estimated by GRAPPA may be greater than the number of events in the true (but unknown) mixture of inversions and transpositions, because one transposition could result in the same change in gene order as two or three inversions.

Increased order in the genome structure after rearrangements

Two genomic structural characteristics were measured: the propensity of adjacent genes to be clustered on the same strand (using the sidedness index, Cs) and the clustering of functionally related genes (using the functional cluster index, Cf) (see Methods). Both indices were calculated for the inferred ancestral gene orders and extant daughter lineages. Among land plants and charophytes, the inferred sidedness among ancestral genomes was similar to that of extant lineages; however, among the chlorophytes an opposite trend was observed, especially in the C. reinhardtii lineage (Additional file 3). The large number of rearrangements in the C. reinhardtii cpDNA lineage resulted in dramatically increased sidedness relative to the inferred most recent common ancestor of C. reinhardtii and C. vulgaris (Cs ancestor = 0.6966, Cs observed = 0.8710; Figure 3A).
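The remark above that one transposition can produce the same gene order as several inversions can be checked on a toy example. The sketch below is illustrative only: it assumes a linear, signed gene order (the sign marks the strand) rather than a circular chromosome, and the helper names `invert` and `transpose` are hypothetical, not part of GRAPPA.

```python
def invert(order, i, j):
    """Reverse the segment order[i:j] and flip the strand (sign) of each gene."""
    return order[:i] + [-g for g in reversed(order[i:j])] + order[j:]


def transpose(order, i, j, k):
    """Swap the adjacent segments order[i:j] and order[j:k]; strands are unchanged."""
    return order[:i] + order[j:k] + order[i:j] + order[k:]


# Toy genome of six genes, all on the same strand (an assumption for illustration).
genome = [1, 2, 3, 4, 5, 6]

# One transposition: exchange block [2, 3] with block [4, 5].
moved = transpose(genome, 1, 3, 5)

# The same result via three inversions:
step1 = invert(genome, 1, 5)  # invert the whole region spanning both blocks
step2 = invert(step1, 1, 3)   # restore the orientation of the first block
step3 = invert(step2, 3, 5)   # restore the orientation of the second block

print(moved, step3)
```

An inversion-only method would therefore count this single transposition as up to three events, which is why an inversion distance can overstate the number of events when transpositions have occurred.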
To test the null hypothesis that the changes in Cs and Cf were a consequence of random genome rearrangements rather than a consequence of directional selection (H0: random rearrangement; HA: constraints in rearrangements), we simulated random rearrangements starting with the inferred ancestral genome along the branch leading to C. reinhardtii. Although inversions are the most abundant type of rearrangement in cpDNAs [27], we also considered the contribution of transpositions under three inversion to transposition ratios, while the total number of rearrangements was fixed according to the branch length inferred using GRAPPA (Figure 2B). Given the large number of rearrangements observed in the C. reinhardtii lineage, Cf was also predicted to decrease significantly under the random breakage model, but the Cf observed in C. reinhardtii did not decrease (Figure 3B).

The increased level of organization in C. reinhardtii cpDNA was associated with both maintenance of ancestral clusters and growth of new clusters. There were six conserved blocks containing 19 of the 85 genes shared between the C. reinhardtii and the C. vulgaris cpDNAs. These blocks include concentrations of genes from a single functional category, such as ribosomal proteins (rpl23-rpl2-rps19, rpl16-rpl14-rps8), Photosystem II (psbL-psbF, psbB-psbT-psbN-psbH), translation apparatus (rrn16-trnI-GAU – trnA-UGC – rrn23-rrn5), and ATP synthase subunits (atpF-atpH). Moreover, a number of small clusters of functionally related genes inferred in the ancestral genome were brought together in C. reinhardtii ("rearranged clusters" in Figure 4B).
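The comparison of an observed index with its simulated null distribution, as described above, amounts to a one-sided empirical p-value. The following is a minimal sketch under assumptions: the text does not give the exact formula used, so the simple proportion below is one common convention, and the function name and toy numbers are hypothetical.

```python
def empirical_p_value(null_values, observed):
    """Return (p, resolution) for a one-sided test.

    p is the fraction of simulated values at least as large as the observed
    value; resolution is 1/N, the smallest non-zero p that N replicates can
    report. When p is 0, the result is usually stated as p < resolution.
    """
    n = len(null_values)
    exceed = sum(1 for value in null_values if value >= observed)
    return exceed / n, 1.0 / n


# Toy null distribution (invented numbers, not the published simulations).
toy_null = [0.1, 0.2, 0.3, 0.4]
p, resolution = empirical_p_value(toy_null, 0.3)
print(p, resolution)
```

Under this convention, 10,000 replicates with no simulated value reaching the observed one would be reported as p < 0.0001, which matches the resolution of the p-values quoted in the abstract.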
Coordinated expression of genes in functional clusters

Co-transcription of several clusters shown in Figure 4B

Discussion

By reconstructing the possible ancestral gene order in chloroplast genomes and simulating rearrangements, we have been able to formally test and reject the null hypothesis that C. reinhardtii cpDNA has evolved through random rearrangements. Instead, we found that its observed gene order deviates strongly from the degree of sidedness and clustering expected under a random breakage model. Euglena gracilis cpDNA also has a high degree of sidedness [33]; however, the asymmetry of its coding strand is concentrated in one half of the genome and associated with GC content, which could be influenced by asymmetrical replication of the chromosome [33]. In C. reinhardtii, the sidedness is not associated with GC content and we hypothesize that it is driven by co-transcription of genes in a functional cluster. Whereas some clusters of co-transcribed genes (e.g. rpl23-rpl2-rps19, rpl16-rpl14-rps8) were maintained in both C. reinhardtii and C. vulgaris, novel clusters clearly formed in the C. reinhardtii lineage (Figure 4B).

Co-transcription of neighboring genes in the C. reinhardtii chloroplast is a widely documented phenomenon. We demonstrated that, in addition to the ribosomal protein clusters, global analyses support the elevated level of clustering of other functionally related genes. The aggregate of genes in clusters includes most essential genes involved in translation and transcription, and some photosynthetic genes. Coordinated transcription may play a crucial role in the regulation of plastid gene expression in response to light or circadian rhythms [34,35]. It is also possible that some clusters contain cis-elements, similar to the artificial polydeoxyadenosine sequences [36], which enhance transcription efficiency. Moreover, most of the putative co-transcription units are not conserved across chlorophytes.
Therefore, the majority of functional clusters observed in C. reinhardtii represent new gene arrangements. In the chloroplast gene order phylogeny (Figure 2B),

Gene order changes reflect relatively rare evolutionary events and are expected to result in much less homoplasy than substitution events in nucleotide or protein sequences over a deep time scale [44]. Phylogeny reconstruction using GRAPPA is highly accurate even for divergent genomes [45], and thus the ancestral gene orders inferred in our study contained sufficient phylogenetic information. The only other software for genome rearrangement phylogeny, BADGER [46], performed poorly on this data set (results not shown). GRAPPA usually inferred unique ancestral gene orders on many data sets we tested. Furthermore, analyses on simulated data have shown that the inferred gene orders scored almost as well as true ancestral gene orders [47]. In our simulation tests of three genomes with 85 genes each, and branch lengths of 50, 20 and 20 (roughly corresponding to the branches leading to C. reinhardtii, C. vulgaris and N. olivacea; see Methods), the average score for ancestral gene orders computed by GRAPPA was only about 7% less than the true scores. In practice, we observed that the less optimal gene orders generally required more rearrangements. Therefore, it is quite likely that any error in our estimation of ancestral gene order has resulted in a downward bias in the inferred number of rearrangements on the branch leading to C. reinhardtii. Increasing the number of rearrangements on this branch would only lead to a more certain rejection of the neutrality of rearrangements.

The accuracy of ancestral genome reconstruction also depends on the degree of divergence among extant taxa and taxon sampling. For example, accurate reconstruction of ancestral genomes at the mammalian CFTR locus was achieved at the DNA level [48].
The high-quality reconstruction was attributed to a dense sampling of syntenic genome sequences from eutherian mammals, and the lack of gene order rearrangement at the locus. Because the C. reinhardtii cpDNA is one of the most rearranged chloroplast genomes sequenced to date, we included all available chlorophyte chloroplast genomes for evolutionary distance estimation and ancestral gene order reconstruction. The accuracy of our ancestral gene order estimation may improve with inclusion of additional chlorophyte plastid gene orders as they become available, but we do not foresee a substantial reduction in the inferred number of rearrangements separating C. reinhardtii and C. vulgaris from their common ancestor.

Inversions are thought to be much more common than transpositions in chloroplast genome evolution [27], and our estimation of ancestral genome order was made with the assumption that all rearrangements were inversions. However, we did consider the contribution of inversions and transpositions under different scenarios in the simulation from the ancestral genome. It should be noted that there is not a unique phylogeny distance measure using transposition only, because computationally one transposition is equivalent to two or three inversions [49]. For this reason, we designed our simulations to allow for various ratios of inversion and transposition events. The results of our simulation study did not vary significantly under these scenarios.

The GRAPPA-IR algorithm was developed to account for the inverted repeat (IR) region found in most plastid genomes. The IR region seems to evolve at a slower rate in both nucleotide sequence and gene order than the single copy regions [50], and frequent intra-molecular recombination homogenizes the two copies [6,51]. The most conserved gene set in the IR region is the rRNA operon. In IR-containing green plastids, the order of rRNA genes is conserved, but the IR boundaries can vary greatly even within one genus [52].
The IR may restrict rearrangements that cross the boundary of single copy regions, and thus concentrate gene order changes within single copy regions. However, this hypothetical constraint of the IR on genome rearrangements seems to have been lost in the C. reinhardtii/C. vulgaris lineage. Notably, both lineages have undergone extensive rearrangements since their divergence from a common ancestor, and they contain only a few conserved clusters encoding rRNA or ribosomal proteins. In either genome, genes that typically reside together in the LSC region have often been scrambled and scattered. When comparing the ancestral genome to the C. vulgaris gene order, there was no distinction of LSC and SSC regions although many large clusters were still shared (additional file 4). If there were constraints on the breakpoint locations, as experimentally identified in bacterial inversion mutants [53], it would limit the possible paths of evolution, and these constraints on the ancestral gene orders would increase the number of rearrangements relative to the estimations derived from GRAPPA. Therefore, as discussed above, our approach of detecting strong deviation from expectation is conservative in that the number of rearrangements may be underestimated. Recent studies of plant, animal and fungal genomes have shown that genes involved in the same pathways or genes sharing similar expression patterns are often spatially clustered [1,5,54]. In eukaryotes, the operon structure has only been demonstrated in the nematode Caenorhabditis [55]. Comparative analyses of yeast genomes indicate that rearrangements brought together duplicate genes forming the DAL cluster involved in allantoin metabolism [56]. In this study, we demonstrated that positive selection for increased clustering has influenced gene order in the chloroplast. Gene clusters, as opposed to separated genes, permit polycistronic transcription and thus fewer transcriptional regulation units. 
Co-transcription may be facilitated by close spacing of genes in cpDNA because transcription termination is inefficient [57]. Although post-transcriptional RNA processing often creates multiple single-gene transcripts, co-transcription foments an initial stoichiometric accumulation of RNA corresponding to each gene in a cluster. Thus, large clusters can be advantageous in coordinating gene expression on this level. Experimental approaches are necessary to understand whether these gene clusters function as operons. Because chloroplast primary transcripts are heavily processed (as just one example, the psbB cluster in maize accumulates as at least 15 distinct mRNA species with varying translational capacities [58]), direct analysis of the functional advantages of clustering in chloroplasts is challenging. Indeed, Chlamydomonas may be a special case, since unlike land plants it has a single rather than multiple RNA polymerases [35]. This situation does not allow differential expression by promoter selectivity, and may therefore serve as a selective force that favors physical grouping of genes rather than evolution of promoter sequences of dispersed genes.

Conclusion

In conclusion, we infer that gene order in the C. reinhardtii plastid evolved in a non-random fashion, and hypothesize that genome structure has been influenced by directional selection acting on variation generated by an increased rate of rearrangement. Our results provide strong evidence that genetic responses to natural selection occur at the level of genome organization. By estimating the ancestral gene order and simulating rearrangements under a null model, we provide a formal demonstration that the chloroplast genome of C. reinhardtii has been shaped by natural selection.
Although the model of natural selection on gene order remains to be developed, application of our methods to sequences of additional chlorophyte plastid genomes would help to improve the accuracy of the ancestral genome reconstruction and inferred branch lengths. The complex process of gene duplication and loss in bacterial and eukaryotic nuclear genomes presents challenges to reconstruction of ancestral gene orders. Still, the development of new comparative tools [59] gives us hope that the type of analysis presented in this paper will soon be applicable to eukaryotic genomes.

Methods

Functional clustering of chloroplast genes

We defined a "functional cluster" as contiguous genes encoded on one strand from one of the following categories: transcription/translation, photosystem I and II, electron transport (cytochrome b6/f complex), and ATP synthase (see additional file 1).

Kolmogorov-Smirnov test of random clusters

A random cluster consists of genes from any functional category. The n = 85 genes shared in the seven chloroplast genomes shown in Figure 1

Phylogeny of chloroplast genomes

Alignments of 50 proteins shared in the 8 chloroplast genomes shown in Figure 2A

Inferring ancestral gene orders

The ancestral gene order was inferred from the gene orders of extant genomes on the best-scored tree in two steps. First, the gene contents for the LSC, SSC and IR regions of ancestral genomes of IR-containing cpDNAs were inferred based on parsimony. Changes in gene copy number due to IR expansion or contraction were considered the last step of gene order changes, and thus the gene contents of ancestral genomes were determined. The ancestral gene orders on the phylogeny for five genomes (excluding C. vulgaris and C. reinhardtii) were computed using GRAPPA-IR [25], which is a modified version of GRAPPA that scores rearrangements independently within LSC, SSC or IR. Second, the chlorophyte algal gene orders (the extant chloroplast gene orders of N. olivacea, C. reinhardtii and C. vulgaris and the inferred ancestral genome of N. olivacea from step one) and the gene order of M. viride were used for the inference of the common ancestral gene order of C. vulgaris and C. reinhardtii.

The data set contains duplicated trnV-UAC and trnG-GCC in C. vulgaris, trnE-UUC and psbA in C. reinhardtii and three trans-splicing psaA exons in C. reinhardtii. The IR regions contained rRNA genes in the same order and orientation in each genome except that one copy was lost in the lineage leading to C. vulgaris. To score the genomes with gene duplications and deletions, multiple data sets were created, each containing genomes with equalized gene contents, by the following assignment rules: one copy of each duplicated gene outside the typical IR was chosen; the IR region lost in C. vulgaris was inserted at all possible locations in that genome. Ideally, all of these data sets (3,936 in total) would be tested with inversion medians; however, such a computation on one data set alone would take more than a month. To overcome this limitation, these data sets were computed using breakpoint medians, and the assignment that yielded the shortest tree was chosen for a full evaluation by GRAPPA. Because the gene contents of LSC and SSC in C. reinhardtii were different from those of other chloroplast genomes in the study, we allowed free rearrangements such that genes in LSC or SSC could commute across the IR.

Ancestral gene order simulation

A set of simulation experiments was conducted to evaluate the accuracy of ancestral genome reconstruction with long branches. Three genomes with 85 genes each were generated from a defined ancestral gene order, and the branch lengths (inversion distances) were 50, 20 and 20, respectively. The true gene order score was 90 (equal to the tree length). The scores were computed for inferred ancestral gene orders by GRAPPA using inversion medians and the random breakage model and then compared to the true score.
The experiment was repeated on 30 data sets.

Random genome rearrangement simulation

Gene orders were simulated under the assumption that the rearrangements involve random breakpoints placed between genes. Initial gene orders were set to the inferred ancestral gene orders. In each replicate, the number of random rearrangement operations applied to the initial genome was set to the number of rearrangements inferred by GRAPPA. The parameters input to the model were the ratios of inversion to transposition (1:0, 10:1, 1:1), chosen to test the sensitivity of the findings to the specific rearrangement model. The simulated genomes had identical gene content but scrambled gene orders relative to those observed in extant genomes, with the exception that inverted repeats were maintained. Test statistics (below) were calculated for each of the 10,000 simulated replicates, and the frequency distributions were used to test the null hypothesis of random rearrangement.

Sidedness index (Cs)

We designed the sidedness index (Cs) to measure the degree to which neighboring genes are clustered on the same strand (side) of the chromosome. A "sided block" includes only adjacent genes on one strand, and the number of sided blocks in a genome is designated as nSB, while the total number of genes is n. Cs is defined as Cs = (n - nSB)/(n - 1). When Cs reaches the maximum of 1, all genes are located on one side. If every gene resides on the strand opposite its neighbors, Cs approaches a minimum of zero.

Functional cluster index (Cf)

We divided a genome of n genes in total into J sided blocks (r1, r2,...rJ). In a block, we assigned genes to functional categories. Let the number of genes in the ith functional category and the jth block be mij; the functional cluster index Cf is defined in terms of the mij. A larger value of Cf indicates that functionally related genes are more clustered into blocks.
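The sidedness index and the random rearrangement simulation described above can be sketched as follows. This is a simplified illustration, not the authors' simulator: it assumes a linear signed gene order (sign = strand), ignores the inverted repeat constraint mentioned in the text, and all function names, the toy ancestor and the parameter values are hypothetical.

```python
import random


def sided_blocks(order):
    """Count maximal runs of adjacent genes on the same strand (nSB)."""
    blocks = 1
    for prev, cur in zip(order, order[1:]):
        if (prev > 0) != (cur > 0):
            blocks += 1
    return blocks


def sidedness(order):
    """Cs = (n - nSB) / (n - 1), following the definition in the text."""
    n = len(order)
    return (n - sided_blocks(order)) / (n - 1)


def invert(order, i, j):
    """Reverse order[i:j] and flip the strand of each gene in it."""
    return order[:i] + [-g for g in reversed(order[i:j])] + order[j:]


def transpose(order, i, j, k):
    """Swap the adjacent segments order[i:j] and order[j:k]."""
    return order[:i] + order[j:k] + order[i:j] + order[k:]


def random_rearrangement(order, p_inversion, rng):
    """Apply one event with breakpoints drawn uniformly between genes."""
    cuts = range(len(order) + 1)
    if rng.random() < p_inversion:
        i, j = sorted(rng.sample(cuts, 2))
        return invert(order, i, j)
    i, j, k = sorted(rng.sample(cuts, 3))
    return transpose(order, i, j, k)


def simulate_null(ancestor, n_events, p_inversion, replicates, seed=0):
    """Return the Cs value of each genome simulated from the ancestor."""
    rng = random.Random(seed)
    values = []
    for _ in range(replicates):
        genome = list(ancestor)
        for _ in range(n_events):
            genome = random_rearrangement(genome, p_inversion, rng)
        values.append(sidedness(genome))
    return values


# Toy run: 20 genes, 10 events, inversion:transposition ratio of 10:1.
toy_ancestor = list(range(1, 21))
null_cs = simulate_null(toy_ancestor, n_events=10, p_inversion=10 / 11,
                        replicates=200)
print(min(null_cs), max(null_cs))
```

The three ratios in the text (1:0, 10:1, 1:1) correspond here to `p_inversion` values of 1.0, 10/11 and 0.5. Transpositions never change strand in this sketch, so only inversions can alter Cs.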
RNA analysis

Wild-type CC-124 cells were grown in Tris-Acetate-Phosphate medium [60] under continuous light to mid-log phase. RNA was isolated from 10 mL of cells as previously described [61]. For filter hybridization, 5 μg of total RNA was fractionated in 1.2% agarose and 6% formaldehyde gels, transferred to nylon membranes, and probed with gene-specific PCR products labeled by random priming according to Church and Gilbert [62].

List of Abbreviations

cpDNA, chloroplast DNA; IR, inverted repeat; SDR, short dispersed repeat

Authors' contributions

LC conducted the analysis and drafted the manuscript. JLM and CWD conceived the study, helped with the analyses and contributed to the text. LSW contributed the code for the genome simulator. JT carried out the ancestral genome reconstruction. LR conducted the RNA analysis. DBS provided further experimental data, review and revision of the draft. All authors read and approved the final manuscript.

Additional File 1
Gene coding and functional categories. Text file, lists the names of 85 genes included in the study and corresponding functional categories.

Additional File 2
Protein alignment matrix. Text file, with a NEXUS format data matrix of concatenated proteins from seven chloroplast genomes and the outgroup, Cyanophora paradoxa.

Additional File 3
The gene order data set. Text file, contains gene orders of seven chloroplast genomes, computed Cs and Cf indices, and the inferred rearrangement phylogeny.

Additional File 4
Comparison of gene clusters. Text file, shows gene clusters shared between the inferred ancestral genome of C. reinhardtii and C. vulgaris and the cpDNAs of C. vulgaris and N. olivacea.

Additional File 5
Inversions separating N. tabacum and M. polymorpha cpDNA. Text file, shows the possible scenarios to transform the chloroplast gene order of N. tabacum to M. polymorpha cpDNA through inversions.

Acknowledgements

We thank A. Jarosz, H. Ma, J. Marden, W. Martin, W. Miller, and D. Schemske for valuable suggestions and comments. This work was supported by NSF awards DBI 0115684 and DEB 0120709 to CWD. JT is supported by the University of South Carolina and part of the work was done while he was visiting the National Evolutionary Synthesis Center. Chlamydomonas genomics work at BTI was supported by NSF awards MCB 9975765 and MCB 0091020 to DBS.

References