Proc Natl Acad Sci U S A. Jun 13, 2006; 103(24): 9118–9123.
Published online Jun 5, 2006. doi:  10.1073/pnas.0601738103
PMCID: PMC1482576
Evolution

Positive selection driving diversification in plant secondary metabolism

Abstract

In Arabidopsis thaliana and related plants, glucosinolates are a major component in the blend of secondary metabolites and contribute to resistance against herbivorous insects. Methylthioalkylmalate synthases (MAM) encoded at the MAM gene cluster control an early step in the biosynthesis of glucosinolates and, therefore, are central to the diversification of glucosinolate metabolism. We sequenced bacterial artificial chromosomes containing the MAM cluster from several Arabidopsis relatives, conducted enzyme assays with heterologously expressed MAM genes, and analyzed MAM nucleotide variation patterns. Our results show that gene duplication, neofunctionalization, and positive selection provide the mechanism for biochemical adaptation in plant defense. These processes occur repeatedly in the history of the MAM gene family, indicating their fundamental importance for the evolution of plant metabolic diversity both within and among species.

Keywords: biochemical neofunctionalization, glucosinolate metabolism, methylthioalkylmalate synthase, plant–enemy interactions

Plants synthesize an immense number of secondary compounds, so called because their significance for processes of basic growth and development is not immediately evident. More than 200,000 known secondary metabolites provide an increasingly exploited reservoir for the generation of pharmaceutically active agents (1), and many more await discovery. Classic hypotheses that seek to explain this vast metabolic diversity propose a stepwise and reciprocal process of adaptation and counteradaptation between plants and their natural enemies, molded by mutual selection (2). In Arabidopsis thaliana and other crucifers, glucosinolates are a major component in the mélange of secondary metabolites. More than 120 glucosinolates are known, which share a chemical core structure but differ in their amino acid-derived side chain. Glucosinolate composition and quantity vary among and within species (3–5). Upon tissue disruption, myrosinase-catalyzed hydrolysis of glucosinolates generates biologically active compounds, which play an important ecological role in plant defense against herbivorous insects (4, 6–9). However, insects can adapt to glucosinolate profiles or evade deleterious effects from glucosinolate hydrolysis by counteradaptation (9–12). In A. thaliana, a modular genetic system regulates diversity in glucosinolate profiles among accessions and may permit the rapid generation of new glucosinolate combinations in response to challenges imposed by the biotic environment (3). Several genetic loci (3, 9), each present in several alleles, interact epistatically and, together with modifying proteins (7), determine the blend of bioactive products that emerges during glucosinolate hydrolysis (9). One genetic locus central for glucosinolate diversity, the methylthioalkylmalate synthase gene (MAM) cluster, controls an early step in the biosynthesis of aliphatic glucosinolates, the predominant glucosinolate class in A. thaliana.
This locus comprises a small family of MAM genes, encoding methylthioalkylmalate synthases that condense ω-methylthio-2-oxoalkanoic acids derived from methionine with acetyl-CoA in a reaction series that extends the ω-methylthio-2-oxoalkanoic acid by one methylene group with each reaction cycle (13) (Fig. 5, which is published as supporting information on the PNAS web site). Thereby, they provide a diverse mixture of precursors for subsequent biosynthetic reaction steps. In A. thaliana, the archetypical configuration of the MAM cluster comprises three genes, MAM2, MAM1, and MAM-L. However, this composition varies among accessions because of gene deletion and conversion events, causing a quantitative trait locus for resistance against herbivorous insects (4). MAM-L provides precursors for aliphatic glucosinolates with long side chains (14), whereas MAM2 and MAM1 control short-chain glucosinolates (4, 13, 15).

Here, we conduct MAM enzyme assays and analyze MAM sequence variation patterns in a phylogenetic framework to further elucidate the evolutionary forces that lead to functional divergence in the MAM gene cluster.

Results

Comparative Sequencing and Phylogenetic Analyses Reveal Complex Relationships of MAM Genes Among Arabidopsis Relatives.

We sequenced the MAM region from several close A. thaliana relatives, Arabidopsis lyrata ssp. lyrata, Arabidopsis lyrata ssp. petraea, and Arabidopsis cebennensis, and the more distantly related Boechera divaricarpa (16). We found that the region is colinear between A. thaliana and its relatives (Fig. 1), consistent with other studies demonstrating extensive synteny between cruciferous genomes (17–20). In A. lyrata, A. petraea, A. cebennensis, and B. divaricarpa, the MAM cluster consists of three genes, which we designate MAMa, MAMb, and MAMc. Phylogenetic analyses (Fig. 2; see Fig. 6, which is published as supporting information on the PNAS web site) show that A. thaliana MAM1 and MAM2 originate from a MAMa gene duplication after A. thaliana diverged from its congeners. Several estimates of evolutionary rates in the Brassicaceae are in rough agreement (16, 21). From these rates, we concluded that MAM1 and MAM2 have existed in A. thaliana for >10^5 generations. Duplication events leading to tandemly linked MAMa, MAMb, and MAMc are much older and predate the divergence of the Arabidopsis and Boechera genera >10 million years ago (16). A. thaliana MAM-L is orthologous to MAMb, and MAMc has been lost in A. thaliana.

Fig. 1.
The MAM region in A. thaliana and relatives. The diverse configuration of the MAM cluster in different A. thaliana accessions (4) is represented by Col-0, Ler-0, and Sorbo. Sorbo exemplifies the archetypical configuration of the locus in A. thaliana, ...
Fig. 2.
Phylogenetic relationships between MAM genes, according to the neighbor-joining method. The significance of the branching order was tested by bootstrapping (2,000 replicates). Isopropylmalate synthase genes from A. thaliana (At1g18500 and At1g74040; ref. ...

Enzyme Assays Reveal MAM1 Biochemical Neofunctionalization.

The MAM1 gene product from the A. thaliana accession Columbia (Col-0) catalyzes the condensation of both 4-methylthio-2-oxobutanoic acid (13) and 5-methylthio-2-oxopentanoic acid (15) with acetyl-CoA (Fig. 5). Thus, Col-0 MAM1 controls the committed step in both the first and second round of methionine carbon chain elongation, ultimately leading to the formation of aliphatic glucosinolates with four methylene groups (4C) in their side chain. We expressed MAM2 from the A. thaliana accession Landsberg erecta (Ler-0) and MAMa from A. lyrata and from Boechera stricta (closely related to B. divaricarpa) in Escherichia coli and tested whether the encoded proteins accepted ω-methylthio-2-oxoalkanoic acids of different carbon chain lengths as substrates for condensation with acetyl-CoA. Likewise, we tested MAM1 from A. thaliana Sorbo, which differed slightly from the original Col-0 MAM1 expression construct (13, 15). We found that Ler-0 MAM2, A. lyrata MAMa, and B. stricta MAMa all catalyzed the condensation of 4-methylthio-2-oxobutanoic acid with acetyl-CoA, but none accepted 5-methylthio-2-oxopentanoic acid or 6-methylthio-2-oxohexanoic acid as a substrate. Thus, Ler-0 MAM2, A. lyrata MAMa, and B. stricta MAMa function only in the first methionine chain elongation cycle (from which 3C glucosinolates originate) but not in subsequent cycles. In contrast, Sorbo MAM1 (like Col-0 MAM1) catalyzed the condensation reactions of both the first and second methionine carbon chain elongation (Fig. 5). We also detected a very minor activity with 6-methylthio-2-oxohexanoic acid, and reevaluation of Col-0 MAM1 confirmed this finding. Thus, subsequent to MAMa duplication, MAM2 retained the original MAMa function, the condensation reaction in the first methionine chain extension cycle, whereas MAM1 acquired additional enzymatic capacity to function in subsequent extension cycles, equivalent to a biochemical neofunctionalization.

A comparison between MAM1, MAM2, and MAMa (Fig. 4) shows most amino acid polymorphisms that distinguish MAM1 from MAM2 and MAMa clustering in the N-terminal portion of the mature protein, suggesting functional importance for substrate specificity. We therefore swapped these domains between MAM1 and MAM2 expression constructs (Fig. 4). The chimeric protein encoded by the 5′-MAM1-MAM2-3′ construct was capable of condensing both 4-methylthio-2-oxobutanoic acid and 5-methylthio-2-oxopentanoic acid with acetyl-CoA, whereas expressed 5′-MAM2-MAM1-3′ only worked with 4-methylthio-2-oxobutanoic acid, thus confirming that the N-terminal portions of mature MAM1 and MAM2 indeed control substrate specificity.
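
The substrate-acceptance logic described above can be restated as a small sketch. It encodes only the major activities reported in the text, assumes (as the domain-swap result suggests) that the N-terminal portion alone sets specificity, and uses hypothetical names (`ACCEPTED_SUBSTRATES`, `chimera_substrates`, `longest_product`) that are not part of the original study.

```python
# Major substrate acceptance as reported in the text, keyed by the chain-length
# label of the 2-oxo acid substrate (2C, 3C, 4C). The very minor 4C activity of
# MAM1 is deliberately ignored in this sketch.
ACCEPTED_SUBSTRATES = {
    "MAMa": {2},
    "MAM2": {2},
    "MAM1": {2, 3},
}


def chimera_substrates(n_terminal_donor, c_terminal_donor):
    """Predict substrates of a domain-swap chimera.

    Assumption: the N-terminal portion alone determines specificity, so the
    C-terminal donor is accepted as an argument but does not affect the result.
    """
    return ACCEPTED_SUBSTRATES[n_terminal_donor]


def longest_product(accepted, start=2):
    """Run elongation cycles while the current substrate is accepted.

    Each accepted cycle extends the chain by one methylene group.
    """
    length = start
    while length in accepted:
        length += 1
    return length


if __name__ == "__main__":
    for enzyme, accepted in ACCEPTED_SUBSTRATES.items():
        print(enzyme, "->", f"{longest_product(accepted)}C")
    print("5'-MAM1-MAM2-3':", sorted(chimera_substrates("MAM1", "MAM2")))
    print("5'-MAM2-MAM1-3':", sorted(chimera_substrates("MAM2", "MAM1")))
```

Under these assumptions the sketch reproduces the qualitative pattern in the text: MAMa and MAM2 stop after the first cycle (3C), whereas MAM1 and the 5′-MAM1-MAM2-3′ chimera proceed through the second cycle (4C).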

Fig. 4.
Amino acid polymorphisms between MAM synthases. Shown are polymorphic amino acids from position 82 onward, i.e., signal peptides are excluded. Amino acids shared between A. lyrata MAMa and other MAM synthases are shaded gray. Residues unique to MAM1 are ...

Evolutionary Analyses Detect Positive Selection Acting on MAM1 After Duplication of an Ancestral MAMa Sequence.

To infer whether selection acted on MAM1 and MAM2 subsequent to the MAMa duplication, we used several statistical tests from molecular evolution. Both MAM1 and MAM2 exhibited an excess of synonymous relative to nonsynonymous changes (Fig. 7, which is published as supporting information on the PNAS web site), indicating purifying selection. This excess of synonymous changes may reflect general conservation of the 3D structural framework enabling MAM enzyme activity. Nonetheless, when we compared MAM1 and MAM2 to A. lyrata or B. divaricarpa MAMa, a modified McDonald–Kreitman test (22) showed that the ratio of replacement to synonymous nucleotide differences between MAM1 and MAM2 is significantly higher than between species (Table 1), suggesting natural selection to diversify biochemical function. In particular, sliding window analyses revealed an elevated level of amino acid substitutions in the N-terminal portion of MAM1 (Fig. 7), which controls substrate specificity. Likewise, we examined ratios of replacement to synonymous changes along the branches of a tree connecting A. lyrata MAMa with A. thaliana MAM1 and MAM2 and tested ratios with a heterogeneity G test. We found heterogeneous ratios (χ² = 10.76; df = 2; P = 0.0046) because of an excess of replacement changes on branch b (Fig. 3), leading from the MAMa duplication to MAM1. Finally, we analyzed codon substitution patterns with a maximum likelihood approach implementing a branch-site model (23–25). Models that allowed for positive selection acting on MAM1 subsequent to the MAMa duplication fit the data significantly better than a nearly neutral model that allowed for neutral or relaxed but not positive selection (Table 2; see also Table 3, which is published as supporting information on the PNAS web site). Moreover, PAML indicated positive selection for most amino acids that distinguish MAM1 from MAM2 or MAMa (Fig. 4 and Table 3).
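
As an illustration of the heterogeneity G test mentioned above, the following minimal sketch computes a G statistic for a table of replacement and synonymous counts per branch and converts a statistic with two degrees of freedom into a P value. The function names and the example counts are hypothetical (the actual counts are in Fig. 3 and are not reproduced here); only the reported statistic, 10.76 with df = 2, comes from the text.

```python
import math


def g_statistic(table):
    """G statistic for an r x c contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    g = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            if observed > 0:  # cells with zero counts contribute nothing
                expected = row_totals[i] * col_totals[j] / total
                g += observed * math.log(observed / expected)
    return 2.0 * g


def chi2_sf_df2(x):
    """Upper-tail probability of a chi-square variable with exactly 2 df."""
    return math.exp(-x / 2.0)


if __name__ == "__main__":
    # Hypothetical (replacement, synonymous) counts on three branches;
    # three branches x two change classes gives (3 - 1) * (2 - 1) = 2 df.
    example_counts = [[20, 10], [5, 15], [6, 14]]
    g = g_statistic(example_counts)
    print(f"G = {g:.2f}, P = {chi2_sf_df2(g):.4f}")

    # Consistency check of the reported value from the text.
    print(round(chi2_sf_df2(10.76), 4))  # 0.0046
```

For two degrees of freedom the chi-square tail probability has the closed form exp(−x/2), which is why no statistics library is needed here; the reported χ² of 10.76 reproduces the reported P of 0.0046.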

Fig. 3.
Gene duplication and speciation. Based on parsimony, we partitioned nonsynonymous (N) and synonymous changes (S) along branches of a simplified phylogeny connecting A. lyrata MAMa, A. thaliana Sorbo and Col-0 MAM1, and Sorbo and Ler-0 MAM2. We excluded ...
Table 1.
McDonald–Kreitman tests for selection
Table 2.
Analysis of codon substitution patterns

In all investigated species, the MAM cluster consists of several gene copies, reflecting ancestral gene duplication events. Therefore, we also tested a data set comprising MAMa, MAMb, and MAMc sequences from our species survey for potential positive selection. Again, models that allowed for positive selection subsequent to gene duplication events (i.e., along basal branches) explained the data significantly better than models that incorporated only neutral evolution or models that allowed for positive selection only on distal branches of the tree (Table 2; see also Table 4, which is published as supporting information on the PNAS web site).

Discussion

We provide evidence that MAM1 and MAM2 originate from a duplication of MAMa and that MAM1, but not MAM2, has been subject to positive selection. We surveyed the enzymes’ biochemical properties, e.g., pH-, ATP-, and metal ion dependency (Table 5 and Fig. 8, which are published as supporting information on the PNAS web site) and found no major difference except for the disparity in substrate acceptance between MAMa, MAM2 (both acting only on 2C substrates), and MAM1 (accepting 2C and 3C substrates and minor amounts of 4C substrate). Our in vitro assays are supported by in planta data; accessions like Col-0 or Sorbo with a functional MAM1 gene accumulate aliphatic glucosinolates derived from dihomomethionine (4C), whereas accessions such as Ler-0 that lack a functional MAM1 (Fig. 1) contain predominantly homomethionine-derived (3C) aliphatic glucosinolates (4, 13). Therefore, we conclude that the additional enzymatic capacity of MAM1 was the target of positive selection.

What are the consequences of this biochemical neofunctionalization? Enzymes encoded at the MAM locus are central to the diversification of glucosinolate metabolism because they determine variability at its earliest stage, thereby providing the material for all other enzymes acting at later stages. Most of these modifying enzymes already are present in distant A. thaliana relatives, including Brassica (9), which shared a common ancestor with A. thaliana some 20 million years ago (16). Because glucosinolate biosynthetic and hydrolytic enzymes together determine the outcome of myrosinase-catalyzed glucosinolate hydrolysis, an expansion of MAM substrate specificity has dramatic effects on the composition of the final, bioactive hydrolysis products, which govern the encounter of cruciferous plants with herbivorous insects and other biotic stressors (9).

However, it is worth noting that MAM1 has not lost the original MAMa function, in contrast to simple models of neofunctionalization that often assume that one copy of a newly arisen gene duplicate loses its original activity in the process of acquiring a novel function. Preservation of the ancestral MAMa activity in MAM1 permits the generation of 3C glucosinolate precursors in the absence of a functional MAM2. These 3C precursors are required as intermediates in the production of 4C glucosinolates (Fig. 5), and in accessions with a functional MAM1, only low quantities of 3C glucosinolates are detectable, irrespective of whether a functional MAM2 is present (3, 4). Hence, the novel MAM1 function leads to an alteration in whole-plant phenotype compared with plants that harbor a MAMa gene. Furthermore, MAM2 also is preserved in contemporary A. thaliana, although its entire activity is covered by the MAM1 gene product. How may this situation be explained? In a recent survey of A. thaliana MAM sequences (4), we detected accessions with a partial or complete deletion of either MAM1 or MAM2. In addition, we found evidence for genetic interchange between MAM1 and MAM2 genes. In some accessions, the gene at the MAM1 locus had been converted partially to a MAM2-like gene. These accessions accumulated 3C glucosinolates when the N-terminal portion of the gene product encoded at the MAM1 locus was affected, consistent with our biochemical analyses of heterologously expressed proteins. In other accessions, the gene at the MAM2 locus had been converted partially to a MAM1-like gene. As expected, these accessions accumulated 4C glucosinolates. Genetic interchanges between MAM1 and MAM2 genes and gene deletion events have occurred frequently; the ancestral MAM2/MAM1 configuration was retained in only one of the surveyed accessions (Sorbo). These processes reflect dynamic evolution of the MAM locus in A. thaliana, possibly as a response to fluctuating selection pressure from a diverse community of enemy populations.

In their seminal paper from 1964, Ehrlich and Raven (2) raised the fundamental question of how plants and their natural enemies coevolve. Can gene duplication, neofunctionalization, and positive selection (26–29) explain the diversification of plant secondary metabolism in the context of challenges imposed by the environment? Numerous multigene families contribute to plant secondary metabolism, and in many cases, distinct (but related) biochemical activities have been reported for different gene family members. Typical examples include A. thaliana and rice cytochrome P450s (30), maize terpene synthases (31), and morning glory chalcone synthases (32). Although selective forces acting on functional diversification have not been elucidated in most of these cases, these and other examples suggest that the processes shaping the evolution of the MAM gene cluster indeed may be generalized as the mechanism for the evolution of plant metabolic diversity.

Furthermore, patterns and dynamics of variation in the MAM gene cluster in Arabidopsis and relatives parallel those observed for disease-resistance genes (R genes) (33–37). Gene duplication and deletion events occur frequently in R gene clusters, reflecting a birth-and-death process, and selective forces act to diversify R gene function. Both classes of genes are involved in plant–enemy interactions, R genes in resistance against diseases, and MAM genes, as part of the glucosinolate-myrosinase system, in resistance against herbivorous insects. Hence, strikingly similar processes contribute to diversification and dynamics in these two unrelated classes of genes, indicating that the same fundamental evolutionary processes apply to genes that contribute to plant–biotic interactions.

Materials and Methods

Sequencing.

We hybridized nylon filters representing total genomic bacterial artificial chromosome libraries from A. lyrata ssp. lyrata (38), A. lyrata ssp. petraea (Plech/Fränkische Schweiz, Germany) and A. cebennensis (Massif Central, France) (both Keygene, Wageningen, The Netherlands), and B. divaricarpa (Vipond Park, MT; 45° 40′ 57′′ N, 112° 53′ 53′′ W) with probes from A. thaliana MAM1, MAM-L, and flanking genes. From clones yielding strong hybridization signals, we isolated bacterial artificial chromosomes with NucleoBond BAC 100 kits (Macherey-Nagel, Düren, Germany) according to the manufacturer’s instructions. We treated isolates with Plasmid-Safe ATP-Dependent DNase (Epicentre Technologies, Madison, WI) to remove contaminating E. coli DNA, partially digested them with Sau3AI or Tsp509I, and ligated the fragments to BamHI- or EcoRI-restricted and dephosphorylated pUC19 or pBluescript II SK vectors. We conducted plasmid preparations with NucleoSpin Robot-96 Plasmid kits (Macherey-Nagel) on a BIOMEK2000 robotic platform (Beckman Coulter). For sequence assembly, we used the dnastar (DNASTAR, Madison, WI) software package.

Cloning of MAM cDNAs.

We isolated total plant RNA from A. thaliana Ler-0 and Sorbo, A. lyrata, and Boechera stricta SAD12 (Gunnison County, CO) by a guanidinium thiocyanate-phenol-chloroform extraction protocol, enriched for mRNA by using the Nucleo Trap mRNA Mini Kit (Macherey-Nagel), and reverse-transcribed it with SuperScript III Reverse Transcriptase (Invitrogen) according to the manufacturer’s instructions. We amplified MAM cDNAs with AccuPrime Taq polymerase (Invitrogen) by using the following primer pairs: Ler-0 MAM2, 5′-ATGTCATRTTGCTCTTCTGTG-3′ and 5′-CACATTAGATGAAACCTGAA-3′; Sorbo MAM1, 5′-ATGTCATRTTGCTCTTCTGTG-3′ and 5′-CACATTCGAYGAAACCTGA-3′; A. lyrata MAMa, 5′-ATGTCATGTTGCTCTTCTGTG-3′ and 5′-TAGCATATTTGATGAAACCT-3′; B. stricta MAMa, 5′-ATGTCATGTTGCTCCTCTGA-3′ and 5′-GAGGGAAACCTGAGGAAC-3′. We chose primer pairs such that expressed gene products lacked the N-terminal signal sequence (13) but contained a C-terminal His tag extension encoded by the pCR T7/CT TOPO (Invitrogen) expression vector. PCR consisted of an initial denaturation at 94°C for 1 min, and 33 cycles at 94°C for 15 s, 55°C for 40 s, and 72°C for 2 min, followed by a final elongation at 72°C for 6 min on a PE Applied Biosystems 9700 thermal cycler. We separated PCR products on 1% agarose gels, excised them, and gel-purified them by using the NucleoSpin Extract Kit (Macherey-Nagel) according to the manufacturer’s instructions. We cloned purified cDNA into pCR T7/CT TOPO and electrotransformed it into E. coli TOP-10 cells (Invitrogen). For each MAM cDNA, we isolated and sequenced several plasmid clones to confirm sequence integrity, reading frame, and direction within the vector.
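
Two of the primers listed above contain IUPAC ambiguity codes (R for A or G, Y for C or T), so each represents a small pool of sequences. The sketch below, with a hypothetical helper `expand_primer`, enumerates the concrete sequences a degenerate primer stands for; it is an illustration only and not part of the original protocol.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_primer(primer):
    """Return every unambiguous sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


if __name__ == "__main__":
    # Forward primer used for Ler-0 MAM2 and Sorbo MAM1 (from the text).
    for seq in expand_primer("ATGTCATRTTGCTCTTCTGTG"):
        print(seq)
    # Reverse primer used for Sorbo MAM1 (from the text).
    for seq in expand_primer("CACATTCGAYGAAACCTGA"):
        print(seq)
```

Each of these two primers expands to two sequences. One expansion of the shared forward primer is identical to the non-degenerate A. lyrata MAMa forward primer given in the text.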

To test which portion of the A. thaliana sequences determined substrate specificity of MAM1 versus MAM2, we excised nucleotides 163–1031 from a functional Sorbo MAM1 expression construct with BamHI/BsmI and ligated the fragment with the vector-containing portion of an identically treated functional Ler-0 MAM2 construct and vice versa. Thus, we obtained two fusion constructs, one with partial Sorbo MAM1 cloned into Ler-0 MAM2 sequence and the second with partial Ler-0 MAM2 cloned into Sorbo MAM1 sequence.

Heterologous Expression.

We transformed MAM constructs into E. coli BL21 (DE3) (Invitrogen) and expressed them heterologously in OvernightExpress Instant TB Medium (Novagen) according to the manufacturer’s instructions. We isolated total soluble protein with the BugBuster/Lysonase system (Novagen), and we determined protein concentration with the Dc Protein Assay (Bio-Rad). We confirmed presence, size, and expression intensity of the target protein with SDS/PAGE and Western blot analysis.

Enzyme Assays.

Enzyme assays followed the basic protocol described in ref. 13 and modified according to ref. 15. To determine pH, ATP, and metal ion dependencies, we carried out MAM activity assays with 80 mM Tris·HCl (pH 8.0)/3 mM 4-methylthio-2-oxobutanoic acid (the 2C substrate, see Fig. 5)/1 mM acetyl-CoA/4 mM Mn2+ as initial conditions. These conditions work well with A. thaliana Col-0 MAM1 (13, 15). Then, we measured pH dependency between pH 6.5 and 9.0 in incremental steps of 0.25 (Table 5). For subsequent assays, we used the pH optimum determined for each single MAM synthase. Next, we optimized ATP concentration by measuring ATP dependency between 0.0 and 6.0 mM in incremental steps of 0.5 mM. Again, in subsequent assays, we chose ATP optima determined for each single MAM synthase. Finally, we measured Mn2+ dependency in concentrations between 0.0 and 9.6 mM in incremental steps of 0.8 mM. We replicated each assay at least three times. We then used these optimal assay conditions for each MAM synthase to determine activity with the native 2C, 3C, and 4C substrates (4-methylthio-2-oxobutanoic acid, 5-methylthio-2-oxopentanoic acid, and 6-methylthio-2-oxohexanoic acid, respectively; see Figs. 5 and 8) and with the artificial 2C, 3C, and 4C substrates (2-oxohexanoic acid, 2-oxoheptanoic acid, and 2-oxooctanoic acid, respectively; see Fig. 8). We used the artificial substrates, with the thioether sulfur in the side chain replaced by a methylene group, to confirm the pattern of in vitro substrate specificity revealed by the use of the natural substrates. These analogues were shown previously to be accepted by A. thaliana Col-0 MAM1 at efficiencies slightly less than those of the native substrates (15). In addition, we measured MAM activity with different metal cations (Ca2+, Co2+, Cu2+, and Mg2+; all added as Me2+Cl2) by using MAM synthase-specific pH and ATP optima and compared it to MAM activity with Mn2+ (Table 5). In these assays, all metal cations were added to a final assay concentration of 5.6 mM. Again, each experiment was repeated at least three times. Because of the large number of replicated enzyme assays and because we found no difference in substrate specificity between crude protein extracts and purified proteins, we mostly used crude protein extracts from heterologous expression for our analyses, with a typical assay containing between 20% and 30% target protein in 100 μg of crude protein.
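
The titration ranges described above can be written out explicitly. This small sketch, using a hypothetical helper `grid`, lists the pH, ATP, and Mn2+ levels implied by the stated ranges and step sizes; it assumes both end points of each range were included, which the text suggests but does not state outright.

```python
def grid(start, stop, step):
    """Evenly spaced levels from start to stop inclusive, rounded to 2 decimals."""
    n_steps = round((stop - start) / step)
    return [round(start + i * step, 2) for i in range(n_steps + 1)]


if __name__ == "__main__":
    ph_levels = grid(6.5, 9.0, 0.25)   # pH units
    atp_levels = grid(0.0, 6.0, 0.5)   # mM ATP
    mn_levels = grid(0.0, 9.6, 0.8)    # mM Mn2+
    for name, levels in [("pH", ph_levels), ("ATP", atp_levels), ("Mn2+", mn_levels)]:
        print(name, len(levels), levels)
```

Under this assumption the 5.6 mM concentration used for the metal ion comparison coincides with one of the Mn2+ titration levels (7 × 0.8 mM).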

Enzyme Purification.

We purified MAM proteins by using the His·Bind column and buffer system (Novagen). We used Vivaspin 15R centrifuge concentrator units with a molecular weight cut off of 10,000 (Sartorius) for rebuffering and concentrating purified protein.

Evolutionary Analyses.

We used clustalx 1.8 to align MAM protein sequences and corrected gap locations manually. We removed alignment positions corresponding to gaps comprising >1 aa in any sequence and converted the amino acid alignment to a nucleotide alignment. We used this data set for all phylogenetic analyses. We used treecon (39) for the construction of a neighbor-joining tree, paup 4.0b10 (40) for maximum likelihood estimation (41), and mrbayes 3.1 (42) for Bayesian inference (43). For both maximum likelihood and Bayesian inference, we assumed six substitution types and based our calculations on empirical nucleotide frequencies. We approximated rate distribution at variable sites by a gamma distribution with four rate categories and an estimated shape parameter α. These settings correspond to the general time reversible with gamma estimation model. We obtained the maximum likelihood seed tree by stepwise addition and used tree-bisection-reconnection as the branch swapping algorithm. We performed Bayesian analysis by simultaneous calculation of four Markov chain Monte Carlo runs, each with four chains of 5 million iterations. mrbayes default settings were used to calculate prior probabilities. Irrespective of whether we used distance matrix, maximum likelihood, or Bayesian inference methods, we obtained the same tree branching order.
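
The alignment preparation described above (aligning proteins, discarding positions in longer gaps, then converting to a nucleotide alignment) might be sketched as follows. This is one possible reading of the procedure, not the authors' actual script: the function names are hypothetical, and the gap filter is interpreted as removing every column that falls within a gap run of two or more residues in any sequence.

```python
def long_gap_columns(aligned_proteins, min_run=2):
    """Columns that lie inside a gap run of at least min_run in any sequence."""
    columns = set()
    for seq in aligned_proteins:
        i = 0
        while i < len(seq):
            if seq[i] == "-":
                j = i
                while j < len(seq) and seq[j] == "-":
                    j += 1
                if j - i >= min_run:
                    columns.update(range(i, j))
                i = j
            else:
                i += 1
    return columns


def protein_to_codon_alignment(aligned_protein, cds):
    """Thread an unaligned coding sequence onto its aligned protein sequence."""
    codons = [cds[k:k + 3] for k in range(0, len(cds) - len(cds) % 3, 3)]
    out, next_codon = [], 0
    for residue in aligned_protein:
        if residue == "-":
            out.append("---")
        else:
            out.append(codons[next_codon])
            next_codon += 1
    return out  # one codon (or gap triplet) per alignment column


def filtered_codon_alignment(aligned_proteins, coding_sequences):
    """Codon alignment with long-gap columns removed, one string per sequence."""
    drop = long_gap_columns(aligned_proteins)
    result = []
    for protein, cds in zip(aligned_proteins, coding_sequences):
        codon_columns = protein_to_codon_alignment(protein, cds)
        kept = [c for idx, c in enumerate(codon_columns) if idx not in drop]
        result.append("".join(kept))
    return result


if __name__ == "__main__":
    # Toy sequences for illustration only.
    proteins = ["MK--S", "M-KLS"]
    cds = ["ATGAAATCA", "ATGAAACTTTCA"]
    print(filtered_codon_alignment(proteins, cds))
```

Working at the codon level keeps the reading frame intact, which matters for the downstream estimates of replacement and synonymous changes.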

dnasp 4.00 (44) was used to calculate replacement and synonymous nucleotide polymorphisms.

Analysis of Codon Substitution Patterns.

We used paml 3.14 (24) to analyze codon substitution patterns with a maximum likelihood approach, implementing a branch-site model (Model A) (23–25). To test for potential positive selection subsequent to the MAMa gene duplication, we used two different alignment files, “MAM1-MAM2-MAMa” and “MAM1-MAM2-MAMa trunc” (Table 3). The first file includes all MAM nucleotide positions, whereas the second alignment file excludes nucleotides corresponding to MAM chloroplast targeting signals. Likewise, we used two different alignment files, “MAMa-MAMb-MAMc” and “MAMa-MAMb-MAMc trunc” to test for positive selection along ancestral phylogenetic branches (Table 4). These alignments are slightly shorter (incorporating alignment positions 1–1461 or 142–1461, respectively) because of a mutation in Boechera divaricarpa that leads to a MAMc protein with a truncated C terminus. In Table 3, “ColMAM1” and “SorboMAM1” refer to MAM1 sequences from A. thaliana accessions Col-0 and Sorbo, respectively. “LerMAM2,” “SorboMAM2,” and “TsuMAM2” refer to MAM2 sequences from A. thaliana accessions Ler-0, Sorbo, and Tsu-0. “ALMAMa,” “APMAMa,” “ACMAMa,” and “BDMAMa” refer to MAMa genes from A. lyrata ssp. lyrata, A. lyrata ssp. petraea, A. cebennensis, and B. divaricarpa, respectively. In Table 4, MAMa, MAMb, or MAMc genes are indicated by similar acronyms.

We calculated log likelihoods for nearly neutral models (Model M1a; allowing for neutral or relaxed but not positive selection) and for models that incorporate positive selection [allowing several nonsynonymous substitutions per nonsynonymous site/synonymous substitutions per synonymous site (Dn/Ds) ratios for branches]. We tested different codon frequency models (F equal, equal codon frequencies; F1 × 4, codon frequencies calculated from the average nucleotide frequencies; F3 × 4, codon frequencies calculated from the average nucleotide frequencies at the three codon positions; F codon, codon frequencies estimated as free parameters) but found that different codon frequency models had little influence on the results (Tables 3 and 4). We also tested different branches for positive selection (shown in column “TREE” as an unrooted tree in parenthesis notation). Symbols “#1” and “$1” indicate foreground branches tested for potential positive selection. With #1, a single branch is tested for potential positive selection, whereas with $1, all branches within a clade are tested for positive selection. “Proportion” and “ω” indicate the proportion of codons experiencing a particular Dn/Ds ratio. Note that nearly neutral models have only two site classes (with 0 < ω < 1, indicating sites under purifying or relaxed selection, and with ω = 1, indicating neutrally evolving sites), whereas models that incorporate positive selection have four site classes. “lnL” gives the log likelihood of a particular model and is used for a likelihood ratio test to compare models incorporating positive selection versus a nearly neutral model (2*ΔlnL). We conducted two different likelihood ratio tests (Tests 1 and 2). When Test 1 suggested positive selection, we conducted a second, very conservative likelihood ratio test. For this second test (Test 2), we fixed the foreground ω = 1 in the site-class model and compared its log likelihood with the nearly neutral model. 
We considered models incorporating positive selection as explaining the data significantly better than nearly neutral models only when both tests indicated positive selection.
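
The likelihood ratio comparison described above (2*ΔlnL between a model allowing positive selection and a null model) can be illustrated with a short sketch. The log likelihoods below are hypothetical placeholders, not values from Tables 3 or 4, and the degrees of freedom are an input the user must supply according to the difference in free parameters between the two models; the helper names are likewise illustrative.

```python
import math


def chi2_sf(x, df):
    """Upper-tail chi-square probability; closed forms for df = 1 and df = 2 only."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("this sketch only handles df = 1 or df = 2")


def likelihood_ratio_test(lnl_null, lnl_alternative, df):
    """Return (2 * delta lnL, P value) for two nested models."""
    statistic = max(2.0 * (lnl_alternative - lnl_null), 0.0)
    return statistic, chi2_sf(statistic, df)


if __name__ == "__main__":
    # Hypothetical log likelihoods for a null and a positive-selection model.
    statistic, p_value = likelihood_ratio_test(-2500.0, -2494.0, df=2)
    print(f"2*dlnL = {statistic:.2f}, P = {p_value:.4g}")
```

Requiring both tests to be significant, as described above, amounts to calling this function once per test and accepting positive selection only if both P values fall below the chosen threshold.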

Acknowledgments

We thank Domenica Schnabelrauch and Susanne Donnerhacke for excellent technical assistance, Heiko Vogel for discussion, and two anonymous reviewers for constructive comments and criticism on a previous version of the manuscript. This work was supported by the Max Planck Society and by Deutsche Forschungsgemeinschaft Grants KR2237/2-1 and KR2237/2-2 (to J.K.).

Abbreviation

MAM, methylthioalkylmalate synthase.

Footnotes

Conflict of interest statement: No conflicts declared.

This paper was submitted directly (Track II) to the PNAS office.

Data deposition: The sequences reported in this paper have been deposited in the EMBL database (accession nos. AM180569–AM180585).

References

1. Hartmann T., Kutchan T. M., Strack D. Phytochemistry. 2005;66:1193–1194.
2. Ehrlich P. R., Raven P. H. Evolution (Lawrence, Kans.) 1964;18:586–608.
3. Kliebenstein D. J., Kroymann J., Brown P., Figuth A., Pedersen D., Gershenzon J., Mitchell-Olds T. Plant Physiol. 2001;126:811–825.
4. Kroymann J., Donnerhacke S., Schnabelrauch D., Mitchell-Olds T. Proc. Natl. Acad. Sci. USA. 2003;100:14587–14592.
5. Windsor A. J., Reichelt M., Figuth A., Svatoš A., Kroymann J., Kliebenstein D. J., Gershenzon J., Mitchell-Olds T. Phytochemistry. 2005;66:1321–1333.
6. Mauricio R., Rausher M. D. Evolution (Lawrence, Kans.) 1997;51:1435–1444.
7. Lambrix V., Reichelt M., Mitchell-Olds T., Kliebenstein D. J., Gershenzon J. Plant Cell. 2001;13:2793–2807.
8. Raybould A. F., Moyes C. L. Heredity. 2001;87:383–391.
9. Kliebenstein D. J., Kroymann J., Mitchell-Olds T. Curr. Opin. Plant Biol. 2005;8:264–271.
10. Blau P. A., Feeny P., Contardo L., Robson D. S. Science. 1978;200:1296–1298.
11. Ratzka A., Vogel H., Kliebenstein D. J., Mitchell-Olds T., Kroymann J. Proc. Natl. Acad. Sci. USA. 2002;99:11223–11228.
12. Wittstock U., Agerbirk N., Stauber E. J., Olsen C. E., Hippler M., Mitchell-Olds T., Gershenzon J., Vogel H. Proc. Natl. Acad. Sci. USA. 2004;101:4859–4864.
13. Kroymann J., Textor S., Tokuhisa J. G., Falk K. L., Bartram S., Gershenzon J., Mitchell-Olds T. Plant Physiol. 2001;127:1077–1088.
14. Field B., Cardon G., Traka M., Botterman J., Vancanneyt G., Mithen R. Plant Physiol. 2004;135:828–839.
15. Textor S., Bartram S., Kroymann J., Falk K. L., Hick A., Pickett J. A., Gershenzon J. Planta. 2004;218:1026–1035.
16. Koch M. A., Haubold B., Mitchell-Olds T. Am. J. Bot. 2001;88:534–544.
17. Rossberg M., Theres K., Acarkan A., Herrero R., Schmitt T., Schumacher K., Schmitz G., Schmidt R. Plant Cell. 2001;13:979–988.
18. Boivin K., Acarkan A., Mbulu R. S., Clarenz O., Schmidt R. Plant Physiol. 2004;135:735–744.
19. Kuittinen H., de Haan A., Vogl C., Oikarinen S., Leppälä J., Koch M. A., Mitchell-Olds T., Langley C., Savolainen O. Genetics. 2004;168:1575–1584.
20. Windsor A. J., Schranz M. E., Formanová N., Gebauer-Jung S., Bishop J., Schnabelrauch S., Kroymann J., Mitchell-Olds T. Plant Physiol. 2006;140:1169–1181.
21. Yang Y. W., Lai K. N., Tai P. Y., Li W. H. J. Mol. Evol. 1999;48:597–604.
22. McDonald J. H., Kreitman M. Nature. 1991;351:652–654.
23. Yang Z., Nielsen R. Mol. Biol. Evol. 2002;19:908–917.
24. Yang Z. Comput. Appl. Biosci. 1997;13:555–556.
25. Yang Z., Wong W. S. W., Nielsen R. Mol. Biol. Evol. 2005;22:1107–1118.
26. Ohno S. Evolution by Gene Duplication. Berlin: Springer; 1970.
27. Lynch M., Conery J. S. Science. 2000;290:1151–1155.
28. Lynch M., O’Hely M., Walsh B., Force A. Genetics. 2001;159:1789–1804.
29. Moore R. C., Purugganan M. D. Proc. Natl. Acad. Sci. USA. 2003;100:15682–15687.
30. Nelson D. R., Schuler M. A., Paquette S. M., Werck-Reichhard D., Bak S. Plant Physiol. 2004;135:756–772.
31. Koellner T. G., Schnee C., Gershenzon J., Degenhardt J. Plant Cell. 2004;16:1115–1131.
32. Durbin M. L., Learn G. H., Jr., Huttley G. A., Clegg M. T. Proc. Natl. Acad. Sci. USA. 1995;92:3338–3342.
33. Michelmore R. W., Meyers B. C. Genome Res. 1998;8:1113–1130.
34. Stahl E. A., Dwyer G., Mauricio R., Kreitman M., Bergelson J. Nature. 1999;400:667–671.
35. Bergelson J., Kreitman M., Stahl E. A., Tian D. Science. 2001;292:2281–2285.
36. Mondragón-Palomino M., Meyers B. C., Michelmore R. W., Gaut B. S. Genome Res. 2002;12:1305–1315.
37. Kuang H., Woo S. S., Meyers B. C., Nevo E., Michelmore R. W. Plant Cell. 2004;16:2870–2894.
38. Kusaba M., Dwyer K., Hendershot J., Vrebalov J., Nasrallah J. B., Nasrallah M. E. Plant Cell. 2001;13:627–643.
39. Van de Peer Y., De Wachter R. Comput. Appl. Biosci. 1997;13:227–230.
40. Swofford D. L. PAUP: Phylogenetic Analysis Using Parsimony. Sunderland, MA: Sinauer; 2003. Version 4.
41. Yang Z. J. Mol. Evol. 1994;39:306–314.
42. Ronquist F., Huelsenbeck J. P. Bioinformatics. 2003;19:1572–1574.
43. Larget B., Simon D. Mol. Biol. Evol. 1999;16:750–759.
44. Rozas J., Sánchez-DelBarrio J. C., Messeguer X., Rozas R. Bioinformatics. 2003;19:2496–2497.
