Plant Cell. Jul 2009; 21(7): 1912–1928.
PMCID: PMC2729604

Comparative Analysis between Homoeologous Genome Segments of Brassica napus and Its Progenitor Species Reveals Extensive Sequence-Level Divergence

Abstract

Homoeologous regions of Brassica genomes were analyzed at the sequence level. These represent segments of the Brassica A genome as found in Brassica rapa and Brassica napus and the corresponding segments of the Brassica C genome as found in Brassica oleracea and B. napus. Analysis of synonymous base substitution rates within modeled genes revealed a relatively broad range of times (0.12 to 1.37 million years ago) since the divergence of orthologous genome segments as represented in B. napus and the diploid species. Similar, and consistent, ranges were also identified for single nucleotide polymorphism and insertion-deletion variation. Genes conserved across the Brassica genomes and the homoeologous segments of the genome of Arabidopsis thaliana showed almost perfect collinearity. Numerous examples of apparent transduplication of gene fragments, as previously reported in B. oleracea, were observed in B. rapa and B. napus, indicating that this phenomenon is widespread in Brassica species. In the majority of the regions studied, the C genome segments were expanded in size relative to their A genome counterparts. The considerable variation that we observed, even between the different versions of the same Brassica genome, for gene fragments and annotated putative genes suggests that the concept of the pan-genome might be particularly appropriate when considering Brassica genomes.

INTRODUCTION

Polyploidy is widespread in angiosperms and is thought to have been a predominant factor in the evolution and success of these species (Leitch and Bennett, 1997; Wendel, 2000). Understanding the mechanisms involved in the structural and functional evolution of genomes during the process of diploidization following polyploidy is of major importance to plant biology. The availability of the complete genome sequence for Arabidopsis thaliana (Arabidopsis Genome Initiative, 2000) has enabled the outcomes of the diploidization process to be analyzed not only at the sequence level directly within the genome of Arabidopsis by the identification of related genome segments (Blanc et al., 2000; Paterson et al., 2000), but also in relation to sequences from distantly related species, including tomato (Solanum lycopersicum; Ku et al., 2000) and rice (Oryza sativa; Mayer et al., 2001). However, studies involving very ancient genome duplication and speciation events, such as those represented in Arabidopsis, the most recent of which, termed the alpha genome duplication (Bowers et al., 2003), give little insight into the mechanisms involved.

The cultivated Brassica species are the group of crops most closely related to Arabidopsis, all of which are members of the Brassiceae tribe within the Brassicaceae family (Warwick and Black, 1991). In contrast with tomato and rice, the lineages of which diverged from that of Arabidopsis ~150 and 200 million years ago (Mya), respectively (Yang et al., 1999; Wolfe et al., 1989), the Brassica and Arabidopsis lineages diverged only ~20 Mya (Yang et al., 1999). The lineages of the species Brassica rapa and Brassica oleracea, which contain the Brassica A and C genomes, respectively, have been estimated to have diverged ~3.7 Mya (Inaba and Nishio, 2002). Brassica napus is an allopolyploid, arising from the hybridization of A and C genome progenitors (U, 1935), probably during human cultivation (i.e., <10,000 years ago). Genetic mapping confirmed that the progenitor A and C genomes are essentially intact in B. napus and have not been rearranged (Parkin et al., 1995). Therefore, the Brassica species provide an opportunity to study the evolution of genome structure over a wide range of timescales. However, representatives of the precise ancestors of natural B. napus have yet to be identified, and the breeding of rapeseed is likely to have included crosses that could have introduced into the oilseed rape germplasm allelic variation from additional sources, such as B. rapa (Qiu et al., 2006).

Comparative studies conducted at the level of genetic linkage maps revealed extensive duplication within Brassica genomes (Kowalski et al., 1994; Lagercrantz and Lydiate, 1996), and segmental relationships were identified indicative of a mixture of single, duplicated, and triplicated genome segments relative to Arabidopsis (Lan et al., 2000; Schmidt et al., 2001; Babula et al., 2003; Lukens et al., 2003; Parkin et al., 2003). More recently, it was determined using a cytogenetic approach that a distinctive feature of the Brassiceae tribe is that its members contain extensively triplicated genomes (Lysak et al., 2005). Around the same time, a study based upon linkage mapping using sequenced restriction fragment length polymorphism markers demonstrated that 21 segments of the genome of Arabidopsis, representing almost its entirety, could be replicated and rearranged to generate the structure of the B. napus genome (Parkin et al., 2005). The majority of the Arabidopsis genome (11 segments) could each be aligned to six segments of the B. napus genome, indicative of triplication in the genomes of both progenitor species. However, there were numerous examples of segments having been detected in fewer than six copies, and some examples of more than six segments having been identified. A broader study across the Brassicaceae has identified 24 conserved chromosomal blocks, relating them to a proposed ancestral karyotype (n = 8) (Schranz et al., 2006). Although the most likely explanation for the structure of the Brassica genomes is paleohexaploidy followed by segmental loss and limited segmental duplication, other explanations are possible (Lukens et al., 2004), including paleotetraploidy followed by more extensive segmental duplication. Where analyses have been conducted on targeted regions of the genomes of B. oleracea, B. rapa, and B. napus using physical mapping techniques, the results have been consistent with the fundamentally triplicated nature of the diploid Brassica genomes (O'Neill and Bancroft, 2000; Park et al., 2005; Rana et al., 2004).

Two sequence-level studies, one in B. oleracea (Town et al., 2006) and one in B. rapa (Yang et al., 2006), have clarified aspects of genome evolution and organization in Brassica by taking a comparative approach using homoeologous regions of the genome of Arabidopsis. Between the studies, 11 Brassica genome segments were analyzed, totaling ~2.8 Mb of contiguous sequences. The overall mean synonymous base substitution rate between Brassica genes and their Arabidopsis orthologs can be calculated as 0.51 (with a range for the individual segments of 0.46 to 0.58). Using the commonly adopted estimate of mutational rate of 1.5 × 10⁻⁸ synonymous substitutions per site per year (Koch et al., 2000), the estimate of the time that the Arabidopsis and Brassica lineages diverged can be refined to ~17.0 Mya. The overall mean synonymous base substitution rate between genes in the related sets of Brassica genome segments can be calculated as 0.43 (with a range for each pair of segments of 0.36 to 0.57), allowing refinement of the estimate of the time that the replicated Brassica subgenomes diverged to ~14.3 Mya.
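The divergence times quoted above are consistent with the standard molecular-clock relation T = Ks / (2r), in which substitutions accumulate independently along both lineages since their common ancestor. The following is a minimal sketch of that arithmetic; the function name is illustrative and the formula is an assumption inferred from the reported numbers, not a description of the software used in the study.

```python
# Sketch only: assumes T = Ks / (2 * r), which reproduces the values in the text.
SUBSTITUTION_RATE = 1.5e-8  # synonymous substitutions per site per year (Koch et al., 2000)


def divergence_time_mya(ks, rate=SUBSTITUTION_RATE):
    """Return the estimated time since divergence, in millions of years."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    # Factor of 2: both lineages accumulate substitutions after the split.
    return ks / (2.0 * rate) / 1e6


if __name__ == "__main__":
    comparisons = [
        ("Brassica vs Arabidopsis orthologs", 0.51),
        ("related Brassica genome segments", 0.43),
    ]
    for label, ks in comparisons:
        print(f"{label}: Ks = {ks:.2f} -> ~{divergence_time_mya(ks):.1f} Mya")
```

With the mean Ks values of 0.51 and 0.43 given in the text, this yields ~17.0 and ~14.3 Mya, respectively.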

The B. rapa study (Yang et al., 2006) characterized an additional segmental duplication, which occurred ~0.8 Mya, resulting in the presence of four homoeologous genome segments in B. rapa for one segment in Arabidopsis. It is likely that such events, along with the segmental deletions, will account for the observed variances from genome triplication that have been observed (Parkin et al., 2005). In the larger study, of ~2.2 Mb of B. oleracea sequences (Town et al., 2006), sequence annotation identified 177 genes in the B. oleracea genome segments that were in perfectly conserved collinear order with their orthologs in Arabidopsis. However, using Arabidopsis as an outgroup, it was shown that 35% of the genes inferred to be present when genome triplication occurred in the Brassica lineage have been lost, most likely via a deletion mechanism, in an interspersed pattern. In addition, evidence for the frequent insertion of gene fragments of nuclear genomic origin was identified, along with four examples of apparently intact genes in noncollinear positions in the B. oleracea and Arabidopsis genomes.

Brassica polyploids can be synthesized artificially. For example, B. napus can be resynthesized by hybridization of B. rapa and B. oleracea. Such lines initially display genome instability, which has been shown to persist for at least five generations of self-pollination and to lead to genetic changes in all lines studied (Gaeta et al., 2007), and which has been interpreted as indicating that a high rate of genome evolution occurs in polyploids (Song et al., 1995). These genetic changes are thought to be homoeologous nonreciprocal transpositions and were correlated with qualitative changes in the expression of specific genes and with phenotypic variation (Gaeta et al., 2007). By contrast, only a small number of homoeologous recombination events have been observed in oilseed rape (B. napus) cultivars (Parkin et al., 1995; Sharpe et al., 1995; Udall et al., 2004). When compared with its progenitor species at the level of genome microstructure, using hybridization-based physical mapping approaches, natural B. napus appears to show relatively little change in gene content and order (Rana et al., 2004). One explanation for the difference between resynthesized and natural B. napus is that natural B. napus may have evolved or inherited a locus controlling homoeologous recombination (Jenczewski et al., 2003).

The complexity of Brassica genome structure caused by multiple rounds of duplication (either segmental or the result of polyploidy), along with chromosomal-scale rearrangements and gene-level deletions, causes immense difficulty for attempts to understand the evolutionary timescales by the analysis of sequences of individual genes. To our knowledge, there have been no sequence-level analyses reported across complete sets of homoeologous segments of the genomes of a polyploid Brassica (such as B. napus) and representatives of its ancestral diploid progenitor species (such as B. rapa and B. oleracea). To fill this knowledge gap, understand more of the evolutionary processes shaping the structure of polyploid genomes over relatively short timescales, and perhaps to begin reconciling the results from natural and resynthesized B. napus, we undertook such a study. We focused on a set of related Brassica genome segments that had already been characterized by BAC-based physical mapping (Rana et al., 2004). These represent six sets of homoeologous genome segments as described in previous studies in B. oleracea (O'Neill and Bancroft, 2000; Town et al., 2006) and an almost complete set of the homoeologous regions of the genomes of B. rapa and B. napus. In the case of B. napus, BAC clones representing both the A genome (from a B. rapa progenitor) and the C genome (from a B. oleracea progenitor) homoeologs were studied.

RESULTS

Generation of Sequence Contigs

BACs were selected for sequencing defined regions of the genomes of B. rapa and B. napus on the basis of previous physical mapping analyses (Rana et al., 2004; Park et al., 2005), with substitution of the B. napus Contig A BACs listed in Rana et al. (2004) for clones identified on the basis of BAC end sequence alignments. The clones represent all or overlapping parts of the BAC contigs assembled (O'Neill and Bancroft, 2000) and sequenced (Town et al., 2006) initially for B. oleracea ssp alboglabra A12DH. The BAC clones that were sequenced as part of this study, and their assignment to contigs and (for B. napus clones) genome, are shown in Table 1. These include KBrH138O03, which contains sequences from B. rapa ssp pekinensis Chiifu. This clone had been sequenced as part of the Brassica rapa Genome Sequencing project and overlaps part of the Contig E region. BACs were generally sequenced to GenBank phase 3 finished standards, although there were several intergenic regions that could not be completely sequenced and two physical gaps in BAC JBr037K23 (representing the B. rapa Contig F region). After the completion of sequencing, the BACs from B. napus C genome Contig E were assembled to produce a single sequence assembly. The lengths of the resulting sequences are summarized in Table 1, along with GenBank accession numbers.

Table 1.
BAC Clones Sequenced

Annotation of Sequence Contigs

Gene prediction was conducted using genemarkHMM (Lukashin and Borodovsky, 1998) with limited manual curation to resolve inconsistencies between paralogs. The B. rapa A genome and B. napus A genome and C genome sequences were newly derived. The B. oleracea C genome contigs were sequenced and published previously (Town et al., 2006). However, recent changes to genemarkHMM necessitated reannotation of these contigs so that gene counts and other features would be comparable across the genome segments. Therefore, this analysis should be considered as including a new annotation of the preexisting B. oleracea sequences. The results are summarized in Table 2.

Table 2.
Summary of Annotated Features in Sequence Contigs Representing Sets of Homoeologous Genome Segments

After accounting for differences in the lengths of the sequenced segments, the relative densities of genes with functional annotations and those related to transposons differ both between homoeologous segments and between the A and C genomes. The highest densities of predicted genes with functional annotation, >20 per 100 kb, are in B. rapa Contigs B, C, and D and B. napus A genome Contigs C and D. The lowest densities, <10 per 100 kb, are in B. oleracea Contigs E and F and B. napus C genome Contigs E and F. The gene density is generally higher in the A genome, with mean values of ~21 genes per 100 kb for B. rapa and ~18 genes per 100 kb for B. napus A genome, than in the C genome, which has mean values of ~12 genes per 100 kb for B. oleracea and 14 genes per 100 kb for B. napus C genome.
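The densities above are feature counts normalized to segment length. A minimal sketch of that normalization follows; the function name and the example numbers are hypothetical and are not taken from Table 2.

```python
def density_per_100kb(feature_count, segment_length_bp):
    """Return the number of features per 100 kb of sequence."""
    if segment_length_bp <= 0:
        raise ValueError("segment_length_bp must be positive")
    return feature_count * 100_000 / segment_length_bp


if __name__ == "__main__":
    # Hypothetical segment: 42 annotated genes in 200,000 bp of sequence.
    print(f"{density_per_100kb(42, 200_000):.1f} genes per 100 kb")
```

Normalizing in this way allows segments of unequal sequenced length to be compared directly, which is the comparison the text draws between the A and C genomes.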

The density of transposon-related gene predictions shows the opposite trend, with generally higher density in the C genome, with mean values of ~9.5 transposon-related gene predictions per 100 kb for B. oleracea and 8.2 transposon-related gene predictions per 100 kb for B. napus C genome, compared with mean values of ~2.5 transposon-related gene predictions per 100 kb for B. rapa and ~2.7 transposon-related gene predictions per 100 kb for B. napus A genome. The contrast in transposon-related content between genomes is greatest in Contigs C and E, which collectively contain only one transposon-related gene prediction across the four A genome segments, but 143 transposon-related gene predictions across the four C genome segments. These results are consistent with the relative expansion of genome regions being principally a consequence of the insertion of transposable elements rather than tandem or segmental duplications of genes or gene-containing sequences.

Overall Alignment of Homoeologous Genome Segments

We first compared the overall similarity of each of the homoeologous regions of the A and C genomes at the nucleotide level using MUMmer (Kurtz et al., 2004). The results for Contig E are shown, by way of an example, in Figures 1 to 4. The results for the remaining contigs are shown in Supplemental Figures 1 to 17 online. For each set of genome segments, the A genomes of B. rapa and B. napus (e.g., Figure 1) show a high degree of similarity along their entire length, as do the C genomes of B. oleracea and B. napus (e.g., Figure 2). In Contig E, there is one inversion, at the end of the A genome segments. In addition to annotated genes, the collinearity includes both intergenic regions and transposons and is punctuated by numerous small insertion/deletion (InDel) events. By contrast, comparisons between the A and C genomes, from either the two diploids (B. rapa and B. oleracea) (e.g., Figure 3) or the allotetraploid B. napus (e.g., Figure 4), showed more fragmented collinearity. This is due primarily to transposon insertions in the C genomes relative to the A genomes and is most pronounced in Contigs C and E.

Figure 1.
Alignment of the Homoeologous Regions of B. rapa (JCVI ID = 97) versus B. napus A Genome (JCVI ID = 96) as Found in Contig E Using MUMmer.
Figure 2.
Alignment of the Homoeologous Regions of B. oleracea (JCVI ID = 120) versus B. napus C Genome (JCVI ID = 98) as Found in Contig E Using MUMmer.
Figure 3.
Alignment of the Homoeologous Regions of B. rapa (JCVI ID = 97) versus B. oleracea (JCVI ID = 120) as Found in Contig E Using MUMmer.
Figure 4.
Alignment of the Homoeologous Regions of B. napus A Genome (JCVI ID = 96) versus B. napus C Genome (JCVI ID = 98) as Found in Contig E Using MUMmer.

Detailed Alignment of Sequence Annotations

VISTA plots (Frazer et al., 2004) provide highly informative visualizations of the similarities and differences between homoeologous chromosome regions. A query sequence is compared with a reference sequence and the annotation on that reference sequence. In the resulting plot, the nucleotide coordinates and annotation are those of the reference sequence, with the y axis showing percentage of identity between the two sequences (computed in 100-bp windows). Different colors are used to draw attention to conservation in exons and in noncoding sequences. Gaps reflect sequence that is present in the reference sequence but absent from the query sequence. For each set of homoeologous regions, VISTA plots were generated for reciprocal analyses of the two versions of the Brassica A genome and for the two versions of the Brassica C genome, as shown in Supplemental Figures 18 to 39 online. These reveal extensive sequence conservation between both coding and noncoding regions, as would be expected for such closely related genomes. They also show that there is extensive variation by InDel events throughout all of the contigs.
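To illustrate the kind of quantity plotted on the y axis, the sketch below computes percentage identity in fixed windows along the reference sequence of a pairwise alignment. It is a simplified illustration, not VISTA's own algorithm: it assumes a gapped pairwise alignment given as two equal-length strings with "-" for gaps, and it uses non-overlapping windows for brevity. The function name is hypothetical.

```python
def windowed_identity(ref_aln, query_aln, window=100):
    """Percent identity per window, in reference coordinates.

    ref_aln and query_aln are the two rows of a pairwise alignment
    (equal length, '-' marks a gap). Columns where the reference has a gap
    carry no reference coordinate and are skipped; columns where the query
    has a gap count as non-identical, mirroring sequence that is present in
    the reference but absent from the query.
    """
    if len(ref_aln) != len(query_aln):
        raise ValueError("alignment rows must have equal length")
    if window <= 0:
        raise ValueError("window must be positive")

    # One True/False flag per reference base.
    matches = []
    for ref_base, query_base in zip(ref_aln.upper(), query_aln.upper()):
        if ref_base == "-":
            continue
        matches.append(ref_base == query_base)

    result = []
    for start in range(0, len(matches), window):
        chunk = matches[start:start + window]
        result.append((start, 100.0 * sum(chunk) / len(chunk)))
    return result


if __name__ == "__main__":
    for start, identity in windowed_identity("ACGTACGT", "ACGTAC--", window=4):
        print(f"reference position {start}: {identity:.0f}% identity")
```

Because the coordinates follow the reference sequence, swapping the two rows gives the reciprocal analysis.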

Comparative Genome Analysis

We compared the genome segments with each other and with the corresponding region of the genome of Arabidopsis on the basis of their gene annotations. To identify sets of orthologous genes, each set of predicted proteins was searched against the Arabidopsis proteome. The results are summarized in Figure 5 and show that there is extensive conservation of both gene order and gene content across each set of related genome segments. The phylogenies of the protein families match those expected from the assignment of genomic regions to Contigs A to F. An example is shown in Figure 6. There were, however, several instances of genes being modeled for some, but not all, members of a set of segments related across genomes. To assess whether or not there were related sequences in these regions lacking an expected gene model, we analyzed all of the sequence contigs by deconstructing them to 1000-bp overlapping segments and used BLASTN to identify sequence similarity to the coding regions of genes annotated in Arabidopsis, as described previously (Town et al., 2006). The results of the analyses, for B. rapa and B. napus sequences, are available in Supplemental Table 1 online. The results of alignment to the homoeologous regions of the Arabidopsis genome are summarized in Supplemental Figure 40 online. This analysis revealed 21 instances (circled in Supplemental Figure 40 online) of the presence of sequences, in collinear positions, related to Arabidopsis gene models, but which had not been incorporated into gene models during the annotation process. In none of these cases could manual intervention produce intact homologous gene models. Rather, they represent instances of partial deletions of genes from particular Brassica genomic regions, leaving collinear conserved gene fragments as noted previously for B. oleracea (Town et al., 2006). 
None of these 21 instances, which occur across nine gene families, involve genes predicted to be involved in transcription or cellular communication/transduction. The analysis also enabled the identification in the sequences derived from B. rapa and B. napus of many interspersed gene fragments, as first described in Brassica species in B. oleracea (Town et al., 2006).
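The deconstruction of contigs into 1000-bp overlapping segments described above can be sketched as follows. The step size between segment starts is not stated in the text; the value of 500 bp used here is an assumption, as are the function and parameter names. Each segment would then serve as a BLASTN query against Arabidopsis coding sequences.

```python
def overlapping_segments(sequence, size=1000, step=500):
    """Yield (start, segment) pairs covering the sequence with overlapping windows.

    size is the segment length in bp; step (an assumed value) is the offset
    between consecutive segment starts. A final segment is added if needed
    so that the end of the sequence is always covered.
    """
    if size <= 0 or step <= 0:
        raise ValueError("size and step must be positive")
    if not sequence:
        return

    last_start = max(len(sequence) - size, 0)
    starts = list(range(0, last_start + 1, step))
    if starts[-1] != last_start:
        starts.append(last_start)

    for start in starts:
        yield start, sequence[start:start + size]


if __name__ == "__main__":
    contig = "ACGT" * 575  # hypothetical 2300-bp sequence
    for start, segment in overlapping_segments(contig):
        print(start, len(segment))
```

Overlapping the segments means that a conserved gene fragment falling across a segment boundary is still fully contained in at least one query.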

Figure 5.
Relationships between Genes Modeled in the Sets of Genome Segments.
Figure 6.
Phylogenetic Analysis of a Family of Sodium:Dicarboxylate Symporters Found on Contigs A, B, and C.

We identified one example of nontandem gene duplication. This involves, in Contig C, a second copy of a gene with high similarity to At5g47590 occurring between orthologs of At5g47600 and At5g47610. This is the only instance of disrupted collinearity with the corresponding regions of the genome of Arabidopsis. There is one example of a noncollinear (with respect to Arabidopsis) homologous sequence that is conserved across the A and C genomes: sequences with similarity to At3g43790 (annotated as a carbohydrate transporter, which has ~88% identity in exon regions) in Contig E.

Timing of Genome Divergence

Previous studies have estimated that the Brassica A and C genomes, as represented in B. rapa and B. oleracea, diverged ~3.7 Mya (Inaba and Nishio, 2002). The approach used (PCR amplification from genomic DNA of specific genes for sequencing) is problematic in polyploids such as B. napus, as both homoeologs tend to coamplify. Thus, the time of divergence of the A and C genomes as represented in natural B. napus and their homoeologs in B. rapa and B. oleracea, respectively, has not been determined. We used the sequences we had obtained to estimate the timing of these events for each genomic region separately. As summarized in Tables 3 to 8, the contigs contained varying numbers of complete sets of gene families conserved across all four homoeologous genome segments, from 3 in Contig F to 15 in Contig C. In addition, 23 genes are conserved between the Contig A homoeologs (which were identified in the C genome only). Synonymous base substitution rates, Ks values, were calculated between the B. oleracea and B. rapa orthologs for each of the genes in Contigs B to F.

Table 3.
Proteins Conserved across Sets of Brassica Contig A Regions and in Arabidopsis
Table 4.
Proteins Conserved across Sets of Brassica Contig B Regions and in Arabidopsis
Table 5.
Proteins Conserved across Sets of Brassica Contig C Regions and in Arabidopsis
Table 6.
Proteins Conserved across Sets of Brassica Contig D Regions and in Arabidopsis
Table 7.
Proteins Conserved across Sets of Brassica Contig E Regions and in Arabidopsis
Table 8.
Proteins Conserved across Sets of Brassica Contig F Regions and in Arabidopsis

Using the commonly adopted estimate of mutational rate of 1.5 × 10⁻⁸ synonymous substitutions per site per year (Koch et al., 2000), the time at which the B. oleracea and B. rapa lineages diverged can be estimated. The mean values, as summarized in Table 9, are in excellent agreement with the previously estimated timing of this divergence, 3.7 Mya (Inaba and Nishio, 2002), validating the mean Ks values of these sets of genes as being an appropriate measure. We therefore used the same approach to quantify the relatedness and divergence times of the A and C genomes as represented in the diploid species (B. oleracea and B. rapa) and the allotetraploid B. napus. The results are shown in Table 9. The estimated time since divergence of the B. napus genome segments (as represented in Contigs A to F) from those representative of their progenitor species differed considerably between the regions studied. In the B. napus C genome, the most closely related segment to that in B. oleracea was Contig E, with a mean estimate of 0.12 Mya. The most distantly related was Contig D, with a mean estimate of 1.31 Mya. In the B. napus A genome, the most closely related segment to that in B. rapa was Contig F, with a mean estimate of 0.45 Mya. The most distantly related was Contig E, with a mean estimate of 1.25 Mya.

Table 9.
Divergence of Genome Segments Based on Synonymous Base Substitution Rates

Divergence of Homoeologous Genome Segments by Single Nucleotide Polymorphisms and InDels

To quantify relative expansion or contraction in length of related Brassica genome segments, we calculated the length of sequence encompassing gene models representing complete sets of conserved genes, from the beginning of the first collinear gene model to the end of the last. The results are summarized in Table 10. Contigs C and E show considerable expansion in the C genome relative to the A genome, with the remainder of the regions being of more similar lengths in the two genomes. The A genome regions in B. napus are generally (four cases out of the five analyzed) shorter than in B. rapa, whereas the C genome regions in B. napus are more frequently (four cases out of the six analyzed) longer than in B. oleracea. The sequenced genome segments contain similar amounts of coding sequence, but the expanded C genome segments show a much increased content of transposon-related and noncoding sequences. The extent of sequence divergence between A and C genomes is of relevance for assessing the feasibility of developing homoeolog-specific molecular markers and of monitoring homoeolog-specific gene expression in B. napus. In addition, any differences in rates of polymorphism occurrence between different fractions of the genome may be indicative of differential constraints on polymorphism generation and retention.

Table 10.
Overall Lengths of Aligned Genome Segments (bp)

We assessed the single nucleotide polymorphism (SNP) content of the coding regions of the genes conserved across sets of related genome segments. This includes both synonymous and nonsynonymous polymorphisms and provides a measure of the polymorphism specifically within the transcriptome. The results are shown in Table 11. The relative rates of polymorphism are all consistent with the relative periods of time estimated since the divergence of the genome segments (as shown in Table 9). The lowest polymorphism rate was observed between the Contig E orthologs in B. oleracea and B. napus C genome: SNPs were present at a frequency of 0.16%. The greatest polymorphism rate was observed between the Contig E orthologs in B. rapa and B. napus A genome: SNPs were present at a frequency of 1.49%.
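A SNP frequency of this kind is the proportion of aligned coding positions at which two orthologs differ. The sketch below shows one way to compute it from a pairwise alignment; it is an illustration under stated assumptions (gap columns and ambiguous "N" bases are excluded from the count), not the procedure used to produce Table 11, and the function name is hypothetical.

```python
def snp_frequency_percent(seq_a, seq_b):
    """Percentage of comparable aligned positions at which two sequences differ.

    Columns containing a gap ('-') or an ambiguous base ('N') in either
    sequence are skipped (an assumption of this sketch), so InDels are not
    counted as SNPs.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0
    snps = 0
    for base_a, base_b in zip(seq_a.upper(), seq_b.upper()):
        if base_a in "-N" or base_b in "-N":
            continue
        compared += 1
        if base_a != base_b:
            snps += 1

    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * snps / compared


if __name__ == "__main__":
    # Hypothetical 10-bp alignment with a single substitution.
    print(f"{snp_frequency_percent('ACGTACGTAC', 'ACGTACGTAT'):.2f}%")
```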

Table 11.
SNP and InDel Rates between Aligned Genome Segments

We then assessed the overall polymorphism content of the genome segments (including the coding sequences). This provides a measure of the polymorphism within A genomic sequences and within C genomic sequences (but not between A and C). The results are shown in Table 11. The relative rates of polymorphism are broadly consistent with the polymorphism rates observed within coding sequences. For example, the least polymorphic genome segments by both analyses are the C genome Contig A and Contig E regions, whereas A genome Contig C and Contig E are among the most polymorphic.

In addition to sequence evolution by single nucleotide mutation, leading to SNPs, insertion-deletion events can also give rise to polymorphisms termed InDels. We assessed the number and sizes of InDels between the genome segments in B. napus and the representatives of the progenitor sequences, including both coding and noncoding sequences. The results are shown in Table 11. The relative rates of InDel polymorphism are consistent with the SNP polymorphism rates observed both within coding sequences and overall. For example, the lowest InDel polymorphism rates were observed between the C genome Contig A and Contig E regions, whereas A genome Contig C and Contig E are among the most polymorphic. The size distribution of the InDels detected is shown in Figure 7 for the Contig E region and Supplemental Figures 41 to 45 for the remaining regions. Although the majority of InDels are very small, mostly under 4 bp, there are numerous larger ones.
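A size distribution of InDels can be tallied from a pairwise alignment by counting each uninterrupted run of gap characters as one event. The following sketch assumes a gapped alignment given as two equal-length strings with "-" for gaps; the function name and the example alignment are hypothetical.

```python
import re
from collections import Counter


def indel_size_distribution(seq_a, seq_b):
    """Count InDel events by size (bp) in a pairwise alignment.

    Each maximal run of '-' in either aligned sequence is treated as a
    single insertion/deletion event whose size is the run length.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")

    sizes = Counter()
    for aligned in (seq_a, seq_b):
        for run in re.finditer(r"-+", aligned):
            sizes[len(run.group())] += 1
    return sizes


if __name__ == "__main__":
    distribution = indel_size_distribution("ACG--TACGT", "ACGTTTA-GT")
    for size in sorted(distribution):
        print(f"{size} bp: {distribution[size]} event(s)")
```

Dividing the total number of events by the aligned length in kilobases would give a rate per kb.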

Figure 7.
Size Distribution, for the Contig E Region, of InDel Variation between B. rapa and B. napus A Genome Segments and between B. oleracea and B. napus C Genome Segments.

Comparison of the A Genome in B. napus, B. rapa ssp trilocularis, and B. rapa ssp pekinensis

Although a BAC-based physical map has been developed for B. rapa ssp trilocularis (http://brassica.bbsrc.ac.uk/IGF/), the Brassica rapa Genome Sequencing project selected a Chinese cabbage, B. rapa ssp pekinensis var Chiifu, for genome sequencing (http://brassica.bbsrc.ac.uk/brassica_genome_sequencing_concept.htm). Consequently, the extent to which the genome of B. rapa ssp pekinensis represents the A genome of B. napus is of particular importance.

A BAC library, named KBrH, has been constructed using genomic DNA of B. rapa ssp pekinensis var Chiifu (Park et al., 2005). We sequenced a portion of clone KBrH138O03 that overlaps with the B. rapa ssp trilocularis and B. napus A genome Contig E segments that we have analyzed at the sequence level. In total, 17,653 bp could be aligned across all three genomes. Over this overlapping region, the B. rapa ssp trilocularis and B. napus A genome sequences differ at 293 bases (1.66%), the B. rapa ssp trilocularis and B. rapa ssp pekinensis sequences differ at 111 bases (0.63%), and the B. rapa ssp pekinensis and B. napus sequences differ at 316 bases (1.79%). To ensure that the three sequences are of consistently high quality, 10 regions rich in polymorphisms were resequenced. All polymorphisms were validated. Therefore, we conclude that the B. rapa genomes are substantially more closely related to each other than they are to the A genome of B. napus and that the genome of B. rapa ssp trilocularis may be slightly more representative of the A genome of B. napus than is the genome of B. rapa ssp pekinensis, at least in this region.
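The percentages above follow directly from the counts of differing bases over the 17,653-bp three-way alignment. As a quick check, the arithmetic can be reproduced as follows; the variable and function names are illustrative.

```python
ALIGNED_LENGTH_BP = 17_653  # bases aligned across all three genomes

# Numbers of differing bases reported in the text for each pair.
DIFFERENCES = {
    ("B. rapa ssp trilocularis", "B. napus A genome"): 293,
    ("B. rapa ssp trilocularis", "B. rapa ssp pekinensis"): 111,
    ("B. rapa ssp pekinensis", "B. napus A genome"): 316,
}


def percent_divergence(differences, aligned_length):
    """Return differing bases as a percentage of the aligned length."""
    if aligned_length <= 0:
        raise ValueError("aligned_length must be positive")
    return 100.0 * differences / aligned_length


if __name__ == "__main__":
    for (first, second), count in DIFFERENCES.items():
        pct = percent_divergence(count, ALIGNED_LENGTH_BP)
        print(f"{first} vs {second}: {count} bases ({pct:.2f}%)")
```

The two B. rapa subspecies differ from each other at well under half the rate at which either differs from the B. napus A genome, which is the basis for the conclusion drawn in the text.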

DISCUSSION

Conservation of Gene Order in Brassica Genomes

As had been shown previously for the B. oleracea genome segments (Town et al., 2006), the homoeologous B. rapa and both B. napus genome segments show almost perfect conservation of gene order with the homoeologous regions of the Arabidopsis genome. Breakdown of collinearity of apparently intact genes between the genomes of Arabidopsis and Brassica species has been postulated to be the consequence of transposition of intact genes (Town et al., 2006). However, as these were present in only one paralogous Brassica genome segment and only one representative of the paralog (that of B. oleracea ssp alboglabra A12DH) was analyzed, it was unclear when the putative transposition took place. We have identified an additional example of an apparently intact gene in a noncollinear position, a gene very similar (~88% nucleotide identity in exon regions) to At3g43790. This gene is in a position not covered by the B. oleracea sequence, but is present, in conserved positions, in the sequences from B. rapa and both genomes of B. napus. We have reexamined the sequences from the paralogous regions of the B. oleracea genome (Contigs D and F in O'Neill and Bancroft, 2000; Town et al., 2006), which do cover the corresponding region. The gene is not present in either. Therefore, we conclude that the most likely explanation is that the transposition of a gene with homology to At3g43790 occurred after the divergence of the lineages leading to Arabidopsis and Brassica, in only one of the paralogous ancestral Brassica genomes, but before the divergence of the Brassica A and C genome lineages.

We identified 21 instances of partial gene loss, where remnants of genes could be identified based on nucleotide sequence similarity, but which could not be included in gene models. None of these involved genes inferred to be involved in transcription or cellular communication/signal transduction, which is consistent with the hypothesis that dosage-sensitive genes are preferentially retained following genome duplication.

Chronology of Brassica Genome Divergence

Using the synonymous base substitution rates in sets of genes conserved across homoeologous genome segments, we estimated that the A and C genomes as represented in B. rapa and B. oleracea diverged between 2.57 and 4.23 Mya, in agreement with previous estimates (Inaba and Nishio, 2002). Our estimates of the timing of divergence of the B. napus genomes relative to the genomes of B. oleracea and B. rapa differed considerably between the genome regions studied, varying between mean estimates of 0.12 and 1.37 Mya. This is unlikely to be indicative of a difference in nucleotide substitution rates across different regions of the genome of B. napus. Rather, because the precise lines of B. oleracea and B. rapa that hybridized to form natural B. napus are unknown, these results more likely indicate that different parts of the genome of B. napus, as represented by European Winter oilseed rape variety Tapidor, were derived from different lines of B. rapa and B. oleracea, none of which were identical to the representatives of these species that we studied (i.e., B. oleracea ssp alboglabra A12DH and B. rapa ssp trilocularis RO18). Further analyses, based on overall sequence polymorphism rates, showed that the B. napus A genome, as represented in Contig E, may be slightly more diverged from that of B. rapa ssp pekinensis var Chiifu (which is the subject of the Brassica genome sequencing effort) than it is from B. rapa ssp trilocularis.
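As a rough illustration of how a synonymous substitution rate (Ks) translates into a divergence time, the following Python sketch applies the standard molecular clock relation T = Ks / (2r). The rate of 1.5 × 10^-8 substitutions per synonymous site per year, the function name, and the Ks inputs are assumptions made for this example only; they are not values stated in this section.

```python
def divergence_time_mya(ks, rate_per_site_per_year=1.5e-8):
    """Convert a pairwise Ks value to a divergence time in millions of years.

    Two lineages accumulate substitutions independently after they split,
    hence the factor of 2 in the denominator.
    The default rate is an assumed value used for illustration only.
    """
    if ks < 0:
        raise ValueError("Ks must be non-negative")
    years = ks / (2.0 * rate_per_site_per_year)
    return years / 1e6


# Hypothetical Ks values chosen only to illustrate the arithmetic.
for ks in (0.0036, 0.0411):
    print(f"Ks = {ks:.4f} -> {divergence_time_mya(ks):.2f} Mya")
```

Under the assumed rate, a small change in Ks corresponds to a large change in the time estimate, which is one reason the estimates are best read as ranges rather than point values.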

Genome Evolution by SNP and InDel Mechanisms

The majority of polymorphisms distinguishing the two Brassica A genomes or the two Brassica C genomes are SNPs. These vary in abundance approximately in proportion to the estimated time since divergence of the genome segments. For example, the relatively closely related C genome Contig E regions, which diverged ~0.12 Mya, show an overall genomic SNP rate of 0.47%, whereas the more distantly related A genome Contig E regions, which diverged ~1.25 Mya, show an overall genomic SNP rate of 1.73%. In addition to SNPs, the Brassica genomes differed by InDels. These occur at high frequency, on average 0.55 per kb between the relatively closely related C genome Contig E regions and 3.73 per kb between the more distantly related A genome Contig E regions. Their abundance in Brassica genomes is consistent with the previously observed ease of identification of molecular markers based on InDel differences (http://brassica.bbsrc.ac.uk/IMSORB/). In two of the regions of the genome that we studied, Contig C and Contig E, the C genome was found to be greatly expanded relative to the A genome, primarily by the insertion of transposable elements. Indeed, the size of the genome of B. oleracea is, at ~600 Mb, significantly larger than that of B. rapa, which is ~500 Mb (Arumuganathan and Earle, 1991). Therefore, this overall genome expansion may be attributable at least partly to transposon amplification in euchromatic regions.
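The two summary statistics quoted above (SNP rate as a percentage and InDels per kb) are simple normalizations of counts by aligned length. A minimal sketch follows; the function names and the counts used in the example are hypothetical and were chosen only so that the arithmetic is easy to follow.

```python
def snp_rate_percent(n_snps, aligned_bp):
    """SNPs per 100 aligned base pairs."""
    if aligned_bp <= 0:
        raise ValueError("aligned_bp must be positive")
    return 100.0 * n_snps / aligned_bp


def indels_per_kb(n_indels, aligned_bp):
    """InDel events per 1000 aligned base pairs."""
    if aligned_bp <= 0:
        raise ValueError("aligned_bp must be positive")
    return 1000.0 * n_indels / aligned_bp


# Hypothetical counts for a 100-kb aligned region.
print(snp_rate_percent(470, 100_000))
print(indels_per_kb(55, 100_000))
```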

Perspectives on Genome Evolution

The gross structure of the Brassica genomes appears to have evolved by a series of polyploidization, segmental duplication, and deletion events in varying proportions dependent upon whether a paleohexaploid or paleotetraploid ancestor was involved. Three complete sets of three related paralogous genome segments have been sequenced, two in B. oleracea (Town et al., 2006) and one in B. rapa (Yang et al., 2006). If evolution had proceeded via a paleotetraploid with subsequent segmental duplication, the extant representative genome segments in the diploid species would show evidence in the triplicated genomic regions of two distinct duplication events. In none of the three cases examined to date was this observed; rather, all three paralogs were approximately equally diverged in each case. This favors the hypothesis of a paleohexaploid ancestor. A later segmental duplication has been characterized at the sequence level and was estimated to have occurred ~0.8 Mya (Yang et al., 2006) (i.e., very much later than the hypothesized hexaploidy). Definitive proof of segmental deletions is difficult to obtain, especially for small segments, when using approaches based upon molecular markers and linkage maps, as have been conducted to date (Parkin et al., 2005), but such deletions are very likely to have occurred. Small-scale deletions have been observed at the level of genome microstructure and sequence in Brassica (O'Neill and Bancroft, 2000; Rana et al., 2004; Town et al., 2006; Yang et al., 2006). Thus, B. napus represents an excellent model system in which to study the process of diploidization following polyploidy.
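The reasoning used above to distinguish a paleohexaploid from a paleotetraploid origin can be sketched as a comparison of the three pairwise divergence values within a set of triplicated segments. The function below is an illustrative heuristic only: the function name, the relative tolerance of 0.25, and the example Ks values are assumptions, and it is not a statistical test or the procedure used in the cited studies.

```python
def classify_triplication(ks_ab, ks_ac, ks_bc, tolerance=0.25):
    """Classify three paralogous segments from their pairwise Ks values.

    If all three pairs are roughly equally diverged, the pattern is what a
    single triplication (hexaploidy) would leave. If one pair is clearly
    younger, the pattern suggests two separate duplication events.
    The tolerance is an arbitrary illustrative threshold.
    """
    values = [ks_ab, ks_ac, ks_bc]
    mean = sum(values) / len(values)
    if mean <= 0:
        raise ValueError("Ks values must be positive")
    relative_spread = (max(values) - min(values)) / mean
    if relative_spread <= tolerance:
        return "consistent with paleohexaploidy"
    return "consistent with two distinct duplication events"


# Hypothetical values: three similar distances, then one younger pair.
print(classify_triplication(0.30, 0.32, 0.31))
print(classify_triplication(0.30, 0.31, 0.10))
```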

There is clear evidence that resynthesized B. napus shows a high rate of genome change (Song et al., 1995; Udall et al., 2004; Lukens et al., 2004), and this continues for at least five generations following polyploidy, leading to qualitative changes in the expression of specific genes and phenotypic variation (Gaeta et al., 2007). The genetic changes are likely to involve homoeologous nonreciprocal transpositions (Gaeta et al., 2007). Natural B. napus may have evolved or inherited a locus controlling homoeologous recombination (Jenczewski et al., 2003), so such a high rate of genome change may not have occurred for long, if at all. We found no evidence within the B. napus genome of homoeologous exchanges (i.e., the genes in the B. napus A genome were most closely related to the genes in B. rapa, and the genes in the B. napus C genome were most closely related to the genes in B. oleracea).

Our studies were successful in providing estimates of the timing of the divergence of the A and C genomes as represented in B. napus and its progenitor species. These differed considerably between different regions of the B. napus genome, indicating that the genome of oilseed rape, as exemplified by var Tapidor, is likely to have been derived from multiple different progenitors with varying degrees of relatedness to B. oleracea ssp alboglabra A12DH and B. rapa ssp trilocularis RO18 or ssp pekinensis Chiifu. It is highly unlikely that we will be able to differentiate between the genome changes that occurred before the formation of B. napus and those which have occurred subsequently.

Our analyses confirm that interspersed gene fragments, first described in Brassica species in B. oleracea (Town et al., 2006), also occur in B. rapa and B. napus. These fragments contain introns and so are of genomic origin. The incorporation of unspliced fragments of unlinked cellular genes into other regions of the genome has been termed transduplication (Juretic et al., 2005) and has been observed to be mediated by MULE, CACTA, and Helitron elements (Jiang et al., 2004; Lai et al., 2005; Zabala and Vodkin, 2005). Although the capture mechanism is not understood, it is likely a consequence of the rolling-circle mechanism of transposon replication (Feschotte and Wessler, 2001). The resulting insertions can contain fragments of many genes (Morgante et al., 2005). Although these are generally pseudogenes (Gupta et al., 2005), they frequently appear to be transcribed (Brunner et al., 2005). Transduplication of an apparently functional gene by a MULE has been reported in Arabidopsis (Hoen et al., 2006). Thus, both transposon-mediated assembly of novel genes and transposon-mediated dispersal of duplicates of functional genes to new positions within the genome have been described in plants. The Brassica genomes show evidence of the consequences of such genome evolutionary mechanisms and represent a new group of related plant species in which to study them. In addition, knowledge of these characteristics of Brassica genomes will be important for comparative genomic approaches for the exploitation of the emerging B. rapa genome sequence.

Our results are consistent with the plasticity of the genomes of Brassica species being similar to those of cereal genomes (Morgante, 2006). It seems likely that the genomes of many of the world's major crop species are evolving and diverging so quickly that we should expand our perspective to consider their pan-genomes, a concept that has been put forward for some bacterial species (Tettelin et al., 2005). The pan-genome comprises a core shared genome and a variable fraction partially shared between lines and acknowledges that the genome of a species is not fully represented by the genome sequence of any single line. Ongoing generation of genetic variation would be consistent with a hypothesis that the continued success in breeding improved varieties of crops such as oilseed rape and wheat (Triticum aestivum), despite very narrow genetic bases, is underpinned by the inherent properties of their genomes to evolve at the sequence level.
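The pan-genome definition given above (a core shared by all lines plus a variable fraction shared by only some) maps directly onto set operations. The sketch below is illustrative; the function name, line names, and gene identifiers are hypothetical and do not correspond to any data in this study.

```python
def pan_genome(gene_sets):
    """Split gene content into core, variable, and pan-genome sets.

    gene_sets maps a line (accession) name to the set of gene identifiers
    found in that line's genome sequence.
    """
    sets = list(gene_sets.values())
    if not sets:
        raise ValueError("at least one line is required")
    core = set.intersection(*sets)   # present in every line
    pan = set.union(*sets)           # present in at least one line
    variable = pan - core            # present in some lines but not all
    return core, variable, pan


# Hypothetical gene content for three lines of one species.
lines = {
    "line_1": {"g1", "g2", "g3"},
    "line_2": {"g1", "g2", "g4"},
    "line_3": {"g1", "g2", "g3", "g5"},
}
core, variable, pan = pan_genome(lines)
print(sorted(core), sorted(variable), len(pan))
```

No single line in this toy example carries the whole pan-genome, which is the point the definition makes about relying on one reference sequence.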

METHODS

BAC Sequencing

The KBrH138O03 clone, which was donated by the Korea Brassica Genome Resource Bank, was sequenced as described previously (Yang et al., 2006). The remaining BACs were sequenced essentially as described previously (Town et al., 2006).

Sequence Annotation

Gene predictions were made using GeneMark.hmm (Lukashin and Borodovsky, 1998) version 3.3b 76 and the Arabidopsis thaliana matrix. This was also the default program used for gene calling both for Arabidopsis annotation (Haas et al., 2005) and the previous Brassica oleracea annotation (Town et al., 2006). Changes in the program since our previous annotation of B. oleracea Contig E necessitated reannotating this contig with the newer version of the program to provide uniformity for comparisons across the contigs. Limited manual curation of gene models was performed to resolve inconsistencies between paralogous gene models uncovered during phylogenetic analysis. Gene models were assigned functions based upon database matches or HMM domain content as described previously. Gene predictions with similarity to known transposons were identified by searching against a curated set of transposon-encoded proteins (ftp://ftp.tigr.org/pub/data/TransposableElements/transposon_db.pep). Predicted proteins <100 amino acids in length with no database match were excluded from the final annotation.
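The final exclusion rule stated above can be expressed as a simple predicate. This is a sketch of the rule as worded in the text, not the annotation pipeline itself; the function name, argument names, and example records are hypothetical.

```python
def keep_gene_model(protein_length, has_database_match, min_length=100):
    """Return True if a predicted protein is retained in the annotation.

    A prediction is excluded only when it is both shorter than min_length
    amino acids and has no database match.
    """
    return protein_length >= min_length or has_database_match


# Hypothetical predictions: (identifier, length in amino acids, database match).
predictions = [("p1", 85, False), ("p2", 85, True), ("p3", 240, False)]
kept = [name for name, length, match in predictions
        if keep_gene_model(length, match)]
print(kept)
```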

VISTA plots (Frazer et al., 2004) were generated using the Web interface hosted at the Lawrence Berkeley Labs (http://genome.lbl.gov/vista/mvista/submit.shtml) using the AVID alignment option (Bray et al., 2003).

Phylogenetic Analysis of Protein Families

Phylogenetic and molecular evolutionary analyses were conducted using MEGA version 4 (Tamura et al., 2007). Protein sequences were aligned using ClustalW using the following default parameters: for pairwise alignments, a gap opening penalty = 10 and a gap extension penalty = 0.1; for multiple alignments, a gap opening penalty = 10 and a gap extension penalty = 0.2. The protein weight matrix used was Gonnet with residue-specific penalties ON, hydrophilic penalties ON, a gap separation distance = 4, end gap separation OFF, use negative matrix OFF and delay divergent cutoff = 30%. Phylogenetic trees were constructed by the neighbor-joining method using default parameters as follows: gaps/missing data, complete deletion; model, amino: Poisson correction; substitutions to include, all; pattern among lineages, same; rates among sites, uniform. The alignments used for phylogenetic analysis are available as Supplemental Data Set 1 online.
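For readers unfamiliar with the distance model named above, the Poisson correction converts the observed proportion of differing amino acid sites, p, into an estimated number of substitutions per site, d = -ln(1 - p). The sketch below computes this for one pair of aligned sequences and drops any column containing a gap or missing-data symbol. It is a simplified pairwise illustration written for this purpose, not MEGA's implementation (in which complete deletion is applied across the whole multiple alignment); the function name and example sequences are hypothetical.

```python
import math


def poisson_distance(aligned_a, aligned_b):
    """Poisson-corrected distance between two aligned protein sequences.

    Columns where either sequence has a gap ('-') or missing data ('?')
    are removed before the proportion of differences is calculated.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b)
             if a not in "-?" and b not in "-?"]
    if not pairs:
        raise ValueError("no comparable sites remain")
    p = sum(a != b for a, b in pairs) / len(pairs)
    if p >= 1.0:
        raise ValueError("distance is undefined when every site differs")
    return -math.log(1.0 - p)


# Hypothetical alignment: one gapped column, one difference in five sites.
print(round(poisson_distance("MKT-LV", "MRTALV"), 4))
```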

Calculation of Ks Values

The analysis was performed by comparing sets of four BACs or BAC contigs representing Brassica napus A genome, Brassica rapa A genome, B. napus C genome, and B. oleracea C genome. Varying numbers of hypothetical gene families were involved in the analyses, and all the contigs were compared against one another for each hypothetical gene family. The Bioperl (Stajich et al., 2002) script bp_pairwise_kaks.pl was used to perform the analyses. The script takes as input the two cDNA sequences to be compared, translates them to their corresponding protein sequences, aligns the protein sequences using ClustalW (Larkin et al., 2007), and then uses the protein alignment together with the cDNA sequences to calculate Ka, Ks, and the Ka/Ks ratio by the yn00 method (Yang and Nielsen, 2000), which is part of the PAML distribution (Yang, 2007).
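The step in which a protein alignment is combined with the cDNA sequences amounts to threading each sequence's codons back onto its aligned protein, so that gaps fall between codons rather than within them. The Python sketch below illustrates that one step in isolation. It is not the Bioperl script's code, the function name and inputs are hypothetical, and the Ka/Ks estimation itself (yn00) is not reproduced here.

```python
def thread_codons(aligned_protein, cdna):
    """Build a codon alignment row from an aligned protein and its cDNA.

    Each residue in the aligned protein consumes the next codon of the
    cDNA; each gap character '-' becomes a three-base gap '---'.
    Trailing bases that do not form a full codon are ignored.
    """
    usable = len(cdna) - len(cdna) % 3
    codons = [cdna[i:i + 3] for i in range(0, usable, 3)]
    n_residues = sum(1 for aa in aligned_protein if aa != "-")
    if n_residues > len(codons):
        raise ValueError("cDNA is too short for the aligned protein")
    codon_iter = iter(codons)
    row = []
    for aa in aligned_protein:
        row.append("---" if aa == "-" else next(codon_iter))
    return "".join(row)


# Hypothetical pair: the first protein has a gap relative to the second.
print(thread_codons("M-K", "ATGAAA"))
print(thread_codons("MTK", "ATGACCAAA"))
```

Aligning at the protein level and then threading codons keeps the reading frame intact, which a direct nucleotide alignment does not guarantee.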

Estimation of SNP and InDel Content

SNPs and InDels among the sequenced BACs were identified using MUMmer (Kurtz et al., 2004), with InDels identified between the genome segments by calculation of the difference in base pair coordinates of consecutive aligned SNP positions.
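One way to read the coordinate-difference approach described above is as follows: if two consecutive aligned SNPs are separated by a different number of bases in the two sequences, the discrepancy is the net size of the InDel(s) lying between them. The sketch below implements that reading. The input format (a list of reference and query coordinate pairs) is a hypothetical simplification and is not MUMmer's output format; the function name and example coordinates are likewise assumptions.

```python
def indels_from_snp_positions(snp_positions):
    """Infer InDels from consecutive aligned SNP coordinates.

    snp_positions is a list of (ref_pos, qry_pos) tuples sorted by ref_pos,
    with both sequences in the same orientation.
    Returns (ref_start, ref_end, size) tuples, where size > 0 means extra
    bases in the query and size < 0 means bases missing from the query.
    """
    indels = []
    for (r1, q1), (r2, q2) in zip(snp_positions, snp_positions[1:]):
        size = (q2 - q1) - (r2 - r1)
        if size != 0:
            indels.append((r1, r2, size))
    return indels


# Hypothetical coordinates: a 12-bp insertion, then a 5-bp deletion.
snps = [(100, 100), (150, 150), (200, 212), (260, 267)]
print(indels_from_snp_positions(snps))
```

Because only the net difference between neighboring SNPs is measured, an insertion and a deletion of equal size in the same interval would cancel out and go undetected under this reading.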

Accession Numbers

Sequence data from this article can be found in the EMBL/GenBank data libraries under the accession numbers listed in Table 1.

Supplemental Data

The following materials are available in the online version of this article.

  • Supplemental Figure 1. Alignment of the Homoeologous Regions of the A and C Genomes, as Represented in Contig A, Using MUMmer, for B. napus C Genome versus B. oleracea.
  • Supplemental Figure 2. Alignment of the Homoeologous Regions of the A and C Genomes, as Represented in Contig B, Using MUMmer, for B. napus A Genome versus B. rapa.
  • Supplemental Figure 3. Alignment of the Homoeologous Regions of the A and C Genomes, as Represented in Contig B, Using MUMmer, for B. napus C Genome versus B. oleracea.
  • Supplemental Figure 4. Alignment of the Homoeologous Regions of the A and C Genomes, as Represented in Contig B, Using MUMmer, for B. rapa versus B. oleracea.
  • Supplemental Figure 5. Alignment of the Homoeologous Regions of the A and C Genomes, as Represented in Contig B, Using MUMmer, for B. napus A Genome versus B. napus C Genome.
  • Supplemental Figure 6. Alignment of the Homoeologous Regions of the A and C Genomes, as Represented in Contig C, Using MUMmer, for B. napus A Genome versus B. rapa.
  • Supplemental Figure 7. Alignment of the Homoeologous Regions of the A and C Genomes, as Represented in Contig C, Using MUMmer, for B. napus C Genome versus B. oleracea.
  • Supplemental Figure 8. Alignment of the Homoeologous Regions of the A and C Genomes, as Represented in Contig C, Using MUMmer, for B. rapa versus B. oleracea.
  • Supplemental Figure 9. Alignment of the Homoeologous Regions of the A and C Genomes, as Represented in Contig C, Using MUMmer, for B. napus A Genome versus B. napus C Genome.
  • Supplemental Figure 10. Alignment of the Homoeologous Regions of the A and C Genomes, as Represented in Contig D, Using MUMmer, for B. napus A Genome versus B. rapa.
  • Supplemental Figure 11. Alignment of the Homoeologous Regions of the A and C Genomes, as Represented in Contig D, Using MUMmer, for B. napus C Genome versus B. oleracea.
  • Supplemental Figure 12. Alignment of the Homoeologous Regions of the A and C Genomes, as Represented in Contig D, Using MUMmer, for B. rapa versus B. oleracea.
  • Supplemental Figure 13. Alignment of the Homoeologous Regions of the A and C Genomes, as Represented in Contig D, Using MUMmer, for B. napus A Genome versus B. napus C Genome.
  • Supplemental Figure 14. Alignment of the Homoeologous Regions of the A and C Genomes, as Represented in Contig F, Using MUMmer, for B. napus A Genome versus B. rapa.
  • Supplemental Figure 15. Alignment of the Homoeologous Regions of the A and C Genomes, as Represented in Contig F, Using MUMmer, for B. napus C Genome versus B. oleracea.
  • Supplemental Figure 16. Alignment of the Homoeologous Regions of the A and C Genomes, as Represented in Contig F, Using MUMmer, for B. rapa versus B. oleracea.
  • Supplemental Figure 17. Alignment of the Homoeologous Regions of the A and C Genomes, as Represented in Contig F, Using MUMmer, for B. napus A Genome versus B. napus C Genome.
  • Supplemental Figure 18. VISTA Plots Showing, for Contig A, the Sequence Relationships between the Contigs from B. napus C Genome as Reference and B. oleracea as Query.
  • Supplemental Figure 19. VISTA Plots Showing, for Contig A, the Sequence Relationships between the Contigs from B. oleracea as Reference and B. napus C Genome as Query.
  • Supplemental Figure 20. VISTA Plots Showing, for Contig B, the Sequence Relationships between the Contigs from B. napus C Genome as Reference and B. oleracea as Query.
  • Supplemental Figure 21. VISTA Plots Showing, for Contig B, the Sequence Relationships between the Contigs from B. oleracea as Reference and B. napus C Genome as Query.
  • Supplemental Figure 22. VISTA Plots Showing, for Contig B, the Sequence Relationships between the Contigs from B. napus A Genome as Reference and B. rapa as Query.
  • Supplemental Figure 23. VISTA Plots Showing, for Contig B, the Sequence Relationships between the Contigs from B. rapa as Reference and B. napus A Genome as Query.
  • Supplemental Figure 24. VISTA Plots Showing, for Contig C, the Sequence Relationships between the Contigs from B. napus C Genome as Reference and B. oleracea as Query.
  • Supplemental Figure 25. VISTA Plots Showing, for Contig C, the Sequence Relationships between the Contigs from B. oleracea as Reference and B. napus C Genome as Query.
  • Supplemental Figure 26. VISTA Plots Showing, for Contig C, the Sequence Relationships between the Contigs from B. napus A Genome as Reference and B. rapa as Query.
  • Supplemental Figure 27. VISTA Plots Showing, for Contig C, the Sequence Relationships between the Contigs from B. rapa as Reference and B. napus A Genome as Query.
  • Supplemental Figure 28. VISTA Plots Showing, for Contig D, the Sequence Relationships between the Contigs from B. napus C Genome as Reference and B. oleracea as Query.
  • Supplemental Figure 29. VISTA Plots Showing, for Contig D, the Sequence Relationships between the Contigs from B. oleracea as Reference and B. napus C Genome as Query.
  • Supplemental Figure 30. VISTA Plots Showing, for Contig D, the Sequence Relationships between the Contigs from B. napus A Genome as Reference and B. rapa as Query.
  • Supplemental Figure 31. VISTA Plots Showing, for Contig D, the Sequence Relationships between the Contigs from B. rapa as Reference and B. napus A Genome as Query.
  • Supplemental Figure 32. VISTA Plots Showing, for Contig E, the Sequence Relationships between the Contigs from B. napus C Genome as Reference and B. oleracea as Query.
  • Supplemental Figure 33. VISTA Plots Showing, for Contig E, the Sequence Relationships between the Contigs from B. oleracea as Reference and B. napus C Genome as Query.
  • Supplemental Figure 34. VISTA Plots Showing, for Contig E, the Sequence Relationships between the Contigs from B. napus A Genome as Reference and B. rapa as Query.
  • Supplemental Figure 35. VISTA Plots Showing, for Contig E, the Sequence Relationships between the Contigs from B. rapa as Reference and B. napus A Genome as Query.
  • Supplemental Figure 36. VISTA Plots Showing, for Contig F, the Sequence Relationships between the Contigs from B. napus C Genome as Reference and B. oleracea as Query.
  • Supplemental Figure 37. VISTA Plots Showing, for Contig F, the Sequence Relationships between the Contigs from B. oleracea as Reference and B. napus C Genome as Query.
  • Supplemental Figure 38. VISTA Plots Showing, for Contig F, the Sequence Relationships between the Contigs from B. napus A Genome as Reference and B. rapa as Query.
  • Supplemental Figure 39. VISTA Plots Showing, for Contig F, the Sequence Relationships between the Contigs from B. rapa as Reference and B. napus A Genome as Query.
  • Supplemental Figure 40. Homologies to Arabidopsis Genes, as Identified by BLAST.
  • Supplemental Figure 41. Size Distribution, for the Contig A Region, of InDel Variation between B. rapa and B. napus A Genome Segments and between B. oleracea and B. napus C Genome Segments.
  • Supplemental Figure 42. Size Distribution, for the Contig B Region, of InDel Variation between B. rapa and B. napus A Genome Segments and between B. oleracea and B. napus C Genome Segments.
  • Supplemental Figure 43. Size Distribution, for the Contig C Region, of InDel Variation between B. rapa and B. napus A Genome Segments and between B. oleracea and B. napus C Genome Segments.
  • Supplemental Figure 44. Size Distribution, for the Contig D Region, of InDel Variation between B. rapa and B. napus A Genome Segments and between B. oleracea and B. napus C Genome Segments.
  • Supplemental Figure 45. Size Distribution, for the Contig F Region, of InDel Variation between B. rapa and B. napus A Genome Segments and between B. oleracea and B. napus C Genome Segments.
  • Supplemental Table 1. The Results of BLASTN Analysis of the B. rapa and B. napus Sequences Relative to Arabidopsis Gene Models, Presented in Excel Format.
  • Supplemental Data Set 1. Alignments Used for Phylogenetic Analysis.

Acknowledgments

We thank Paul Wilkinson of the University of Bath and members of the John Innes Centre Genome Laboratory for their contributions to the sequencing of the clones. This work was funded by the UK Biotechnology and Biological Sciences Research Council (BBS/B/07330, BB/E017363, and competitive support grant to the John Innes Centre). The research of Y.P.L., J.-Y.P., S.-J.K., and J.-A.K. was supported by grants from Rural Development Administration (BioGreen 21 Program 20050301034438 and National Academy of Agricultural Science Projects 2007139062200001502 and 200901FHT020710397), the Technology Development Program for Agriculture and Forestry, Ministry for Food, Agriculture, Forestry, and Fisheries (Project No. 607003-05), and National Institute of Agricultural Biotechnology (Project 04-1-12-2), Korea. J.C.P., C.T., and A.H.P. were supported by the U.S. National Science Foundation (DBI-0638536).

Notes

The author responsible for distribution of materials integral to the findings presented in this article in accordance with the policy described in the Instructions for Authors (www.plantcell.org) is: Ian Bancroft (ian.bancroft@bbsrc.ac.uk).

[W] Online version contains Web-only data.

[OA] Open access articles can be viewed online without a subscription.

www.plantcell.org/cgi/doi/10.1105/tpc.108.060376

References

  • Arabidopsis Genome Initiative (2000). Analysis of the genome of the flowering plant Arabidopsis thaliana. Nature 408 796–815. [PubMed]
  • Arumuganthan, K., and Earle, E.D. (1991). Nuclear DNA content of some important plant species. Plant Mol. Biol. Rep. 9 208–218.
  • Babula, D., Kaczmarek, M., Barakat, A., Delseny, M., Quiros, C.F., and Sadowski, J. (2003). Chromosomal mapping of Brassica oleracea based on ESTs from Arabidopsis thaliana: Complexity of the comparative map. Mol. Genet. Genomics 268 656–665. [PubMed]
  • Blanc, G., Barakat, A., Guyot, R., Cooke, R., and Delseny, M. (2000). Extensive duplication and reshuffling in the Arabidopsis genome. Plant Cell 12 1093–1101. [PMC free article] [PubMed]
  • Bowers, J.E., Chapman, B.A., Rong, J., and Paterson, A.H. (2003). Unravelling angiosperm genome evolution by phylogenetic analysis of chromosomal duplication events. Nature 422 433–438. [PubMed]
  • Bray, N., Dubchak, I., and Pachter, L. (2003). AVID: A global alignment program. Genome Res. 13 97–102. [PMC free article] [PubMed]
  • Brunner, S., Pea, G., and Rafalski, A. (2005). Origins, genetic organization and transcription of a family of non-autonomous helitron elements in maize. Plant J. 43 799–810. [PubMed]
  • Feschotte, C., and Wessler, S.R. (2001). Treasures in the attic: Rolling circle transposons discovered in eukaryotic genomes. Proc. Natl. Acad. Sci. USA 98 8923–8924. [PMC free article] [PubMed]
  • Frazer, K.A., Pachter, L., Poliakov, A., Rubin, E.M., and Dubchak, I. (2004). VISTA: Computational tools for comparative genomics. Nucleic Acids Res. 32: W273–W279. [PMC free article] [PubMed]
  • Gaeta, R.T., Pires, J.C., Iniguez-Luy, F., Leon, E., and Osborn, T.C. (2007). Genomic changes in resynthesized Brassica napus and their effect on gene expression and phenotype. Plant Cell 19 3403–3417. [PMC free article] [PubMed]
  • Gupta, S., Gallavotti, A., Stryker, G.A., Schmidt, R.J., and Lal, S.K. (2005). A novel class of Helitron- related transposable elements in maize contain portions of multiple pseudogenes. Plant Mol. Biol. 57 115–127. [PubMed]
  • Haas, B.J., Wortman, J.R., Ronning, C.M., Hannick, L.I., Smith, R.K. Jr, Maiti, R., Chan, A.P., Yu, C., Farzad, M., Wu, D., White, O., and Town, C.D. (2005). Complete reannotation of the Arabidopsis genome: methods, tools, protocols and the final release. BMC Biology 3 7. [PMC free article] [PubMed]
  • Hoen, D.R., Park, K.C., Elrouby, N., Yu, Z., Mohabir, N., Cowan, R.K., and Bureau, T.E. (2006). Transposon-mediated expansion and diversification of a family of ULP-like genes. Mol. Biol. Evol. 23 1254–1268. [PubMed]
  • Inaba, R., and Nishio, T. (2002). Phylogenetic analysis of Brassiceae based on the nucleotide sequences of the S-locus related gene, SLR1. Theor. Appl. Genet. 105 1159–1165. [PubMed]
  • Jenczewski, E., Eber, F., Grimaud, A., Huet, S., Lucas, M.O., Monod, H., and Chèvre, A.-M. (2003). PrBn, a major gene controlling homoeologous pairing in oilseed rape (Brassica napus) haploids. Genetics 164 645–653. [PMC free article] [PubMed]
  • Jiang, N., Bao, Z., Zhang, X., Eddy, S.R., and Wessler, S.R. (2004). Pack-MULE transposable elements mediate gene evolution in plants. Nature 431 569–573. [PubMed]
  • Juretic, N., Hoen, D.R., Huynh, M.L., Harrison, P.M., and Bureau, T.E. (2005). The evolutionary fate of MULE-mediated duplications of host gene fragments in rice. Genome Res. 15 1292–1297. [PMC free article] [PubMed]
  • Koch, M.A., Haubold, B., and Mitchell-Olds, T. (2000). Comparative evolutionary analysis of chalcone synthase and alcohol dehydrogenase loci in Arabidopsis, Arabis, and related genera (Brassicaceae). Mol. Biol. Evol. 17 1483–1498. [PubMed]
  • Kowalski, S.D., Lan, T.-H., Feldmann, K.A., and Paterson, A.H. (1994). Comparative mapping of Arabidopsis thaliana and Brassica oleracea chromosomes reveals islands of conserved gene order. Genetics 138 499–510. [PMC free article] [PubMed]
  • Ku, H.-M., Vision, T., Liu, J., and Tanksley, S.D. (2000). Comparing sequenced segments of the tomato and Arabidopsis genomes: Large-scale duplication followed by selective gene loss creates a network of synteny. Proc. Natl. Acad. Sci. USA 97 9121–9126. [PMC free article] [PubMed]
  • Kurtz, S., Phillippy, A., Delcher, A.L., Smoot, M., Shumway, M., Antonescu, C., and Salzberg, S.L. (2004). Versatile and open software for comparing large genomes. Genome Biol. 5 R12. [PMC free article] [PubMed]
  • Lagercrantz, U., and Lydiate, D. (1996). Comparative genome mapping in Brassica. Genetics 144 1903–1910. [PMC free article] [PubMed]
  • Lai, J., Li, Y., Messing, J., and Dooner, H.K. (2005). Gene movement by Helitron transposons contributes to the haplotype variability of maize. Proc. Natl. Acad. Sci. USA 102 9068–9073. [PMC free article] [PubMed]
  • Lan, T.H., DelMonte, T.A., Reischmann, K.P., Hyman, J., Kowalski, S., McFerson, J., Kresovich, S., and Paterson, A.H. (2000). An EST-enriched comparative map of Brassica oleracea and Arabidopsis thaliana. Genome Res. 10 776–788. [PMC free article] [PubMed]
  • Larkin, M.A., et al. (2007). ClustalW and ClustalX version 2. Bioinformatics 23 2947–2948. [PubMed]
  • Leitch, I.J., and Bennett, M.D. (1997). Polyploidy in angiosperms. Trends Plant Sci. 2 470–476.
  • Lukashin, A., and Borodovsky, M. (1998). GeneMark.hmm: New solutions for gene finding. Nucleic Acids Res. 26 1107–1115. [PMC free article] [PubMed]
  • Lukens, L., Zou, F., Lydiate, D., Parkin, I., and Osborn, T. (2003). Comparison of a Brassica oleracea genetic map with the genome of Arabidopsis thaliana. Genetics 164 359–372. [PMC free article] [PubMed]
  • Lukens, L.N., Quijada, P.A., Udall, J., Pires, J.C., Schranz, M.E., and Osborn, T.C. (2004). Genome redundancy and plasticity within ancient and recent Brassica crop species. Biol. J. Linn. Soc. Lond. 82 665–674.
  • Lysak, M.A., Koch, M.A., Pecinka, A., and Schubert, I. (2005). Chromosome triplication found across the tribe Brassiceae. Genome Res. 15 516–525. [PMC free article] [PubMed]
  • Mayer, K., et al. (2001). Conservation of microstructure between a sequenced region of the genome of rice and multiple segments of the genome of Arabidopsis thaliana. Genome Res. 11 1167–1174. [PMC free article] [PubMed]
  • Morgante, M. (2006). Plant genome organisation and diversity: The year of the junk! Curr. Opin. Biotechnol. 17 168–173. [PubMed]
  • Morgante, M., Brunner, S., Pea, G., Fengler, K., Zuccolo, A., and Rafalski, A. (2005). Gene duplication and exon shuffling by helitron-like transposons generate intraspecies diversity in maize. Nat. Genet. 37 997–1002. [PubMed]
  • O'Neill, C.M., and Bancroft, I. (2000). Comparative physical mapping of segments of the genome of Brassica oleracea var ssp. alboglabra that are homoeologous to sequenced regions of the chromosomes 4 and 5 of Arabidopsis thaliana. Plant J. 23 233–243. [PubMed]
  • Park, J.Y., et al. (2005). Physical mapping and microsynteny of Brassica rapa ssp. pekinensis genome corresponding to a 222 kb gene-rich region of Arabidopsis chromosome 4 and partially duplicated on chromosome 5. Mol. Genet. Genomics 274 579–588. [PubMed]
  • Parkin, I.A.P., Gulden, S.M., Sharpe, A.G., Lukens, L., Trick, M., Osborn, T.C., and Lydiate, D.J. (2005). Segmental structure of the Brassica napus genome based on comparative analysis with Arabidopsis thaliana. Genetics 171 765–781. [PMC free article] [PubMed]
  • Parkin, I.A.P., Sharpe, A.G., Keith, D.J., and Lydiate, D.J. (1995). Identification of the A and C genomes of amphidiploid Brassica napus (oilseed rape). Genome 38 1122–1131. [PubMed]
  • Parkin, I.A.P., Sharpe, A.G., and Lydiate, D.J. (2003). Patterns of genome duplication within the Brassica napus genome. Genome 46 291–303. [PubMed]
  • Paterson, A.H., Bowers, J.E., Burow, M.D., Draye, X., Elsik, C.G., Jiang, C., Katsar, C.S., Lan, T., Lin, Y., Ming, R., and Wright, R.J. (2000). Comparative genomics of plant chromosomes. Plant Cell 12 1523–1539. [PMC free article] [PubMed]
  • Qiu, D., et al. (2006). A comparative linkage map of oilseed rape and its use for QTL analysis of seed oil and erucic acid content. Theor. Appl. Genet. 114 67–80. [PubMed]
  • Rana, D., van den Boogaart, T., O'Neill, C.M., Hynes, L., Bent, E., Macpherson, L., Park, J.Y., Lim, Y.P., and Bancroft, I. (2004). Conservation of the microstructure of genome segments in Brassica napus and its diploid relatives. Plant J. 40 725–733. [PubMed]
  • Schranz, M.E., Lysak, M.A., and Mitchell-Olds, T. (2006). The ABC's of comparative genomics in the Brassicaceae: Building blocks of crucifer genomes. Trends Plant Sci. 11 535–542. [PubMed]
  • Schmidt, R., Acarkan, A., and Boivin, K. (2001). Comparative structural genomics in the Brassicaceae family. Plant Physiol. Biochem. 39 253–262.
  • Sharpe, A.G., Parkin, I.A.P., Keith, D.J., and Lydiate, D.J. (1995). Frequent non-reciprocal translocations in the amphidiploid genome of oilseed rape. Genome 38 1112–1121. [PubMed]
  • Song, K., Lu, P., Tang, K., and Osborn, T.C. (1995). Rapid genome change in synthetic polyploids of Brassica and its implications for polyploid evolution. Proc. Natl. Acad. Sci. USA 92 7719–7723. [PMC free article] [PubMed]
  • Stajich, J.E., et al. (2002). The Bioperl Toolkit: Perl modules for the life sciences. Genome Res. 12 1161–1168. [PMC free article] [PubMed]
  • Tamura, K., Dudley, J., Nei, M., and Kumar, S. (2007). MEGA4: Molecular Evolutionary Genetics Analysis (MEGA) software version 4.0. Mol. Biol. Evol. 24 1596–1599. [PubMed]
  • Tettelin, H., et al. (2005). Genome analysis of multiple pathogenic isolates of Streptococcus agalactiae: Implications for the microbial “pan-genome”. Proc. Natl. Acad. Sci. USA 102 13950–13955. [PMC free article] [PubMed]
  • Town, C.D., et al. (2006). Comparative genomics of Brassica oleracea and Arabidopsis thaliana reveals gene loss, fragmentation and dispersal following polyploidy. Plant Cell 18 1348–1359. [PMC free article] [PubMed]
  • U, N. (1935). Genome analysis in Brassica with special reference to the experimental formation of B. napus and peculiar mode of fertilization. Jpn. J. Bot. 7 389–452.
  • Udall, J., Quijada, P., and Osborn, T.C. (2004). Detection of chromosomal rearrangements derived from homologous recombination in four mapping populations of Brassica napus L. Genetics 169 967–979. [PMC free article] [PubMed]
  • Warwick, S.I., and Black, L.D. (1991). Molecular systematics of Brassica and allied genera (Subtribe Brassicinae, Brassiceae) – Chloroplast genome and cytodeme congruence. Theor. Appl. Genet. 82 81–92. [PubMed]
  • Wendel, J.F. (2000). Genome evolution in polyploids. Plant Mol. Biol. 42 225–249. [PubMed]
  • Wolfe, K.H., Gouy, M., Yang, Y.W., Sharp, P.M., and Li, W.H. (1989). Date of the monocot-dicot divergence estimated from the chloroplast DNA sequence data. Proc. Natl. Acad. Sci. USA 86 6201–6205. [PMC free article] [PubMed]
  • Yang, T.J., et al. (2006). Sequence-level analysis of the diploidization process in the triplicated FLC region of Brassica rapa. Plant Cell 18 1339–1347. [PMC free article] [PubMed]
  • Yang, Y.W., Lai, K.N., Tai, P.Y., and Li, W.H. (1999). Rates of nucleotide substitution in angiosperm mitochondrial DNA sequences and dates of divergence between Brassica and other angiosperm lineages. J. Mol. Evol. 48 597–604. [PubMed]
  • Yang, Z., and Nielsen, R. (2000). Estimating synonymous and nonsynonymous substitution rates under realistic evolutionary models. Mol. Biol. Evol. 17 32–43. [PubMed]
  • Yang, Z. (2007). PAML 4: A program package for phylogenetic analysis by maximum likelihood. Mol. Biol. Evol. 24 1586–1591. [PubMed]
  • Zabala, G., and Vodkin, L.O. (2005). The wp mutation of Glycine max carries a gene-fragment-rich transposon of the CACTA superfamily. Plant Cell 17 2619–2632. [PMC free article] [PubMed]

Articles from The Plant Cell are provided here courtesy of American Society of Plant Biologists