Plant Cell. May 2008; 20(5): 1199–1216.
PMCID: PMC2438456

Biochemical Networks and Epistasis Shape the Arabidopsis thaliana Metabolome


Genomic approaches have accelerated the study of the quantitative genetics that underlie phenotypic variation. These approaches associate genome-scale analyses such as transcript profiling with targeted phenotypes such as measurements of specific metabolites. Additionally, these approaches can help identify uncharacterized networks or pathways. However, little is known about the genomic architecture underlying data sets such as metabolomics or the potential of such data sets to reveal networks. To describe the genetic regulation of variation in the Arabidopsis thaliana metabolome and test our ability to integrate unknown metabolites into biochemical networks, we conducted a replicated metabolomic analysis on 210 lines of an Arabidopsis population that was previously used for targeted metabolite quantitative trait locus (QTL) and global expression QTL analysis. Metabolic traits were less heritable than the average transcript trait, suggesting that there are differences in the power to detect QTLs between transcript and metabolite traits. We used statistical analysis to identify a large number of metabolite QTLs with moderate phenotypic effects and found frequent epistatic interactions controlling a majority of the variation. The distribution of metabolite QTLs across the genome included 11 QTL clusters; 8 of these clusters were associated in an epistatic network that regulated plant central metabolism. We also generated two de novo biochemical network models from the available data, one of unknown function and the other associated with central plant metabolism.


Understanding the molecular and genetic bases of complex traits like disease resistance, growth, and development is a unifying goal in diverse scientific fields. Genetic variation regulating complex traits in natural populations is largely quantitative and polygenic and can interact with environmental, epigenetic, and other genetic factors (Falconer and Mackay, 1996; Lynch and Walsh, 1998). Quantitative trait mapping, the most common approach to the analysis of complex traits, measures the association of genetic markers with phenotypic variation, delineating quantitative trait loci (QTL) (Liu, 1998; Lynch and Walsh, 1998). Computational and genomic advances have generated increasingly precise QTL maps for a wide array of traits, ranging from development and morphology to metabolism and disease resistance (Kliebenstein et al., 2002b; Lexer et al., 2005; Symonds et al., 2005; Anderson et al., 2006; Keurentjes et al., 2006; Hoffmann and Weeks, 2007; Yagil et al., 2007). However, the molecular bases of many quantitative traits remain unknown despite the long history of QTL identification (Sax, 1923).

Quantitative trait analysis is enhanced by the use of microarray technology to measure global transcript levels in mapping populations and to map expression QTLs (eQTLs) (Jansen and Nap, 2001; Doerge, 2002; Schadt et al., 2003). Whole genome eQTL analysis in multiple organisms has revealed that gene expression traits are highly heritable, with a complex genetic architecture (Brem et al., 2002; Schadt et al., 2003; Morley et al., 2004; Keurentjes et al., 2007; West et al., 2007). These studies found large numbers of both cis- and trans-acting eQTLs, with evidence of nonadditive genetic variation such as epistasis and transgressive segregation as well as genetic variation altering entire transcriptional networks (Kliebenstein et al., 2006; Keurentjes et al., 2007; Potokina et al., 2008). Recent work directly links eQTLs to phenotypic alterations in specific metabolic pathways, highlighting the complexity of interactions between transcript and metabolite variation (Sønderby et al., 2007; Wentzell et al., 2007; Hansen et al., 2008). These analyses suggest significant differences between the organization of genetic regulation of transcripts and metabolites for a specific subset of Arabidopsis thaliana secondary metabolites (Wentzell et al., 2007), but this hypothesis has not been broadly tested within the plant metabolome.

Quantitative genetic analysis in plants has enabled the detailed molecular dissection of several secondary metabolite biosynthetic pathways (Magrath et al., 1993; McMullen et al., 1998; Kliebenstein et al., 2001b, 2001c; Szalma et al., 2005). In addition, broad-spectrum metabolite analyses now allow QTL mapping of an expanded portion of the plant metabolome (Keurentjes et al., 2006; Schauer et al., 2006; Meyer et al., 2007). This approach has confirmed known secondary metabolite QTLs and aided the identification of a new enzymatic step in a known secondary metabolite pathway (Kliebenstein et al., 2001a; Kroymann et al., 2003; Keurentjes et al., 2006). It remains to be tested whether metabolomic analyses can generate hypotheses regarding new pathways or recreate linkages between known metabolic pathways.

Comparative analysis of metabolite and developmental variation suggests an integral link between plant central metabolism and development/physiology, but QTLs for metabolite and developmental traits were not colocalized more than expected by chance (Keurentjes et al., 2006; Meyer et al., 2007). This lack of overlap between known development and metabolite QTLs may indicate that genetic regulation of plant metabolism is more complex than presumed, such that current studies lack sufficient power to detect the majority of metabolite QTLs present in a population. The level of genetic complexity regulating plant metabolism will determine the size of the structured mapping populations necessary for effective QTL analyses and could potentially affect the methodology and interpretation of association mapping studies (Beavis, 1994, 1998; Nordborg et al., 2005; Clark et al., 2007).

To begin describing the genetic regulation of variation in the Arabidopsis metabolome, we conducted metabolomic analyses on the Arabidopsis Bayreuth-0 (Bay) × Shahdara (Sha) recombinant inbred line (RIL) population (Loudet et al., 2002). This population has previously been utilized for targeted metabolite QTL and global eQTL analysis (Loudet et al., 2003; Calenge et al., 2006; Kliebenstein et al., 2006; Wentzell et al., 2007; West et al., 2007). Metabolic traits were less heritable than were global transcript levels, suggesting that metabolite accumulation may be more susceptible to environmental influence. Statistical analysis identified a large number of metabolite QTLs with moderate phenotypic effects, and informed pairwise marker tests showed that epistasis strongly influences the genetic architecture underlying the Arabidopsis metabolome. Eleven QTL clusters influenced more metabolites than expected. Eight of these clusters are associated in an epistatic network that appears to regulate plant central metabolism. Two clusters were associated with previously identified secondary metabolite loci (Kliebenstein et al., 2001b, 2001c). These results show that the genetic architecture underlying the Arabidopsis metabolome is highly complex and governed by numerous epistatic interactions. Interpreting relationships among these QTLs will require the analysis of significantly larger mapping populations. In spite of the limited power available in 210 RILs, we were able to identify two de novo biochemical networks, one of unknown function and the other associated with central plant metabolism.


Metabolite Distribution and Detection

We utilized the University of California Davis Metabolomics Core gas chromatography–time of flight–mass spectrometry (GC-TOF-MS) metabolomics platform to measure metabolite accumulation in the Arabidopsis accessions Bay and Sha, the parents of the Bay × Sha RIL population (Loudet et al., 2002; Nikiforova et al., 2005). This GC-TOF-MS platform is believed to detect predominantly primary metabolites within plant samples, and metabolites are identified based on comparison with reference spectra (Roessner et al., 2001; Meyer et al., 2007; http://fiehnlab.ucdavis.edu/Metabolite-Library-2007). The Bay and Sha parents significantly differed in metabolite accumulation for 61 of 396 metabolites detected in this preliminary experiment (see Supplemental Data Sets 1 and 2 online). Metabolites differing significantly between Bay and Sha included phosphoric acid and a diverse set of amino acids, sugars, and fatty acids. Each parent had the highest levels of roughly equal numbers of metabolites, with Sha showing higher levels of tricarboxylic acid cycle (TCA) and pentose phosphate–associated metabolites, while Bay was typically higher in amino acids and storage sugars (see Supplemental Data Set 2 online). Consistent with a global analysis of variable transcript accumulation in these same parents, the average metabolite heritability as measured using just the Bay and Sha parents was 7% (Figure 1A). The distribution of metabolite and transcript heritabilities was similar using these two parents, with transcripts having a slightly heavier tail (Figure 1A).

Figure 1.
Metabolite and Transcript Level Heritabilities.

To measure metabolomic variation in the progeny from Bay and Sha, four biological replicate samples from each of 210 Bay × Sha RILs were analyzed via a GC-TOF-MS platform. The majority of the metabolites (330) were detected in both parents and >90% of the RILs, while a subset of metabolites were detected in one parent but not the other (40 in Bay only and 28 in Sha only). As found in previous Arabidopsis metabolite QTL studies, a significant number of metabolites (226) was detected in the RILs only (Kliebenstein et al., 2001c, 2002a; Lambrix et al., 2001; Keurentjes et al., 2006) (see Supplemental Figure 1 online). The presence or absence of metabolites was likely caused by a mixture of polymorphisms with qualitative and quantitative effects, as over half of the known metabolites were detected in fewer than half of the RILs. This included pyruvate, salicylic acid, citrulline, and Met, all metabolites with central roles in plant metabolism. These metabolites are likely present in all of the RILs but below the detection threshold in some lines.

In a prior study of two specific Arabidopsis secondary metabolite pathways, average heritability estimates for metabolite traits were less than those for the transcripts encoding the biosynthetic enzymes (Wentzell et al., 2007). Our analysis of heritability for all metabolites detected in both replicate RIL experiments showed that the average heritability was 25% and the highest heritability was 55% (Figure 1B; see Supplemental Data Sets 3 and 4 online; these data sets also include measures of all other estimatable sources of variance for this experiment). This range of metabolite heritability was much lower than the distribution of global transcript heritability, suggesting that these metabolite traits in Arabidopsis were subject to more influence and/or noise from both the internal (physiological) and external plant environment (Figure 1B).
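
The heritability estimate above can be illustrated with a minimal sketch. It assumes a balanced one-way layout (equal replicate counts per line) and a simple variance-component estimator; the function name and the toy data are hypothetical, and the model used in the study may include additional variance terms (see the Supplemental Data Sets).

```python
from statistics import mean

def broad_sense_heritability(lines):
    """Estimate H2 = var_G / (var_G + var_E) from a balanced one-way layout.

    `lines` maps a line name to its replicate measurements.
    Assumption: every line has the same number of replicates.
    """
    groups = list(lines.values())
    n_rep = len(groups[0])
    if any(len(g) != n_rep for g in groups):
        raise ValueError("this sketch assumes a balanced design")
    k = len(groups)
    grand = mean(x for g in groups for x in g)
    # Mean squares between lines and within lines (replicate error).
    ms_between = n_rep * sum((mean(g) - grand) ** 2 for g in groups) / (k - 1)
    ms_within = sum((x - mean(g)) ** 2 for g in groups for x in g) / (k * (n_rep - 1))
    # Genetic variance component; truncated at zero when noise dominates.
    var_g = max((ms_between - ms_within) / n_rep, 0.0)
    return var_g / (var_g + ms_within)

# Hypothetical toy data: three lines, two replicates each.
toy = {"RIL_1": [1.0, 1.2], "RIL_2": [2.1, 1.9], "RIL_3": [3.0, 3.2]}
print(round(broad_sense_heritability(toy), 3))
```

A low value indicates that most of the observed variation lies among replicates of the same line rather than among lines.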

The difference in metabolite heritability, in terms of both qualitative metabolite presence and quantitative metabolite content, between the RILs and parents resulted from transgressive segregation. It increased the level of genetically regulated phenotypic variation within the RILs in comparison with the parents, thus leading to an increase in genetic variance or heritability as measured in the RILs. Transgressive segregation for metabolite presence was manifested in the significant fraction of metabolites found in both parents that were not detected in all RILs as well as in metabolites found in neither parent but detected in a number of the RILs (see Supplemental Figure 1 online). Analysis of the RILs for 330 metabolites found in both parents identified positive and negative transgressive segregation for metabolite accumulation (see Supplemental Figure 2 and Supplemental Data Set 5 online). Positive transgressive segregation was found for 143 metabolites, in which at least half of the RILs had metabolite accumulation more than twice the value of the highest parent and <5% of the RILs had values lower than the lowest parent. One hundred thirty-eight metabolites showed negative transgressive segregation, in which half of the RILs had metabolite accumulation lower than half of the lowest parent and <5% of the RILs accumulated metabolite levels higher than the highest parent. Of the remaining metabolites, the majority showed transgressive segregation in both directions (see Supplemental Data Set 5 online). Thus, Bay and Sha possess significant genetic variation for metabolite accumulation that is not evident in the parental phenotypes.
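
The classification thresholds stated above can be written as a short rule. The sketch below is illustrative only: the function name and input layout are hypothetical, and it encodes just the two criteria given in the text (at least half of the RILs beyond twice the highest parent or below half the lowest parent, with fewer than 5% of RILs beyond the opposite parent).

```python
def classify_transgression(ril_values, parent_a, parent_b):
    """Label a metabolite as positive, negative, or other transgressive segregation."""
    high, low = max(parent_a, parent_b), min(parent_a, parent_b)
    n = len(ril_values)
    frac_far_above = sum(v > 2 * high for v in ril_values) / n
    frac_far_below = sum(v < 0.5 * low for v in ril_values) / n
    frac_above = sum(v > high for v in ril_values) / n
    frac_below = sum(v < low for v in ril_values) / n
    # Positive: half or more of the RILs exceed twice the highest parent,
    # and fewer than 5% fall below the lowest parent.
    if frac_far_above >= 0.5 and frac_below < 0.05:
        return "positive"
    # Negative: half or more of the RILs fall below half the lowest parent,
    # and fewer than 5% exceed the highest parent.
    if frac_far_below >= 0.5 and frac_above < 0.05:
        return "negative"
    return "other"
```

Metabolites labeled "other" by this rule would include those that segregate transgressively in both directions.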

Metabolomic QTL Location and Effect

Data from 557 metabolites present in the 210 Bay × Sha RILs were used to map QTLs. The composite interval mapping (CIM) algorithm within QTL Cartographer identified 438 QTLs affecting 243 metabolites (Figure 2; see Supplemental Data Set 6 online). This included 77 putatively identified metabolites and 166 unknown compounds (http://fiehnlab.ucdavis.edu/Metabolite-Library-2007). For 315 metabolites, no QTLs were detected using CIM: these were equally split between metabolites detected and not detected in the parents. Metabolic QTLs were not equally distributed across the genome, and we identified 11 regions that contained more metabolite QTLs than expected by random chance (permutation estimated P < 0.05; Figure 2). Five of these metabolite QTL clusters (Met.chromosome.centimorgan [cM]), Met.II.15, Met.II.47, Met.III.04, Met.IV.65, and Met.V.67, colocated with previously identified eQTL hot spots, suggesting a link with transcript variation (West et al., 2007). The remaining loci did not associate with eQTL hot spots. However, colocalization of the AOP and Elong loci with known QTLs that determine aliphatic glucosinolate accumulation, glucosinolate structure, and transcript accumulation for aliphatic glucosinolate biosynthetic genes suggests that eQTL hot spots can be pathway- or network-specific and may not appear within genome-scale analyses (Wentzell et al., 2007).
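
One way to picture a permutation threshold for QTL clusters is sketched below. This is an assumption-laden illustration, not the procedure used in the study: it assumes the genome is divided into equal bins (`n_bins` is a hypothetical parameter), places QTLs uniformly at random, and takes the upper 5% of the per-permutation maximum bin count as the cluster threshold.

```python
import random

def cluster_threshold(n_qtl, n_bins, n_perm=1000, alpha=0.05, seed=0):
    """Return the QTL count per bin exceeded by chance in about alpha of permutations."""
    rng = random.Random(seed)
    maxima = []
    for _ in range(n_perm):
        counts = [0] * n_bins
        # Scatter the QTLs across bins at random (null: no hot spots).
        for _ in range(n_qtl):
            counts[rng.randrange(n_bins)] += 1
        maxima.append(max(counts))
    maxima.sort()
    # Index of the (1 - alpha) quantile, clamped to the last element.
    idx = min(n_perm - 1, round((1 - alpha) * n_perm))
    return maxima[idx]

# 438 QTLs as reported above; 50 bins is an arbitrary, hypothetical choice.
threshold = cluster_threshold(438, 50, n_perm=200)
print(threshold)
```

Any bin whose observed QTL count exceeds the returned value would be flagged as a candidate cluster under these assumptions.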

Figure 2.
Genetic Architecture of Metabolite QTLs across the Arabidopsis Genome.

To further investigate the allelic effects of these metabolite QTL clusters, we utilized analysis of variance (ANOVA) to test the marker closest to the peak of each metabolite QTL cluster for association with metabolite accumulation. We individually analyzed all 557 metabolites detected in the RILs to test whether variation in their accumulation was associated with any of the metabolite QTL clusters. After correction for multiple comparisons, this analysis showed that 372 metabolites associated with at least one QTL hot spot (see Supplemental Data Set 6 online). Increased QTL detection via this approach could be attributed to the inclusion of significant QTLs in the model, thus decreasing error variance and increasing the power to detect QTLs that may have been marginal in the CIM analysis. This approach had crude similarities to multiple-interval mapping approaches. By contrast, the CIM analysis focused on individual loci and did not allow for the error term to be adjusted downward for significant QTLs.

The ANOVA provided two estimates of allelic effect for each significant QTL. First, the genetic r² estimated the proportion of phenotypic variance attributable to genotype at that specific locus. Second was the allelic substitution effect, which expressed the change in phenotype associated with substitution of one parental allele for the other at the queried locus (Figure 3). In a population possessing a broad range of phenotypic values, allelic substitution effects may be more reflective of the biological impact of a given locus than genetic r². For example, if a population has a 10-fold range in phenotype, a specific QTL of 10% r² will have a 1-fold phenotypic difference or 100% allelic substitution effect. Since the phenotypic range for many metabolites in this RIL population was quite large, the majority of detected QTLs had an allelic substitution effect size of 30% or greater despite small r² values (Figure 3; see Supplemental Data Set 5 online). These low estimates of genetic r² also suggested that numerous unidentified genetic loci influenced metabolite concentrations in this population.
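
The contrast between the two effect measures can be made concrete with a small sketch. The function name and the toy data are hypothetical, and the allelic substitution effect is expressed here relative to the population mean, which is an assumption; the study may have used a different baseline.

```python
from statistics import mean

def marker_effects(phenotypes, genotypes):
    """Return (r_squared, allelic_substitution_percent) for one biallelic marker.

    `genotypes` holds "B" (Bay) or "S" (Sha) for each line, in phenotype order.
    """
    grand = mean(phenotypes)
    by_allele = {
        allele: [p for p, g in zip(phenotypes, genotypes) if g == allele]
        for allele in ("B", "S")
    }
    ss_total = sum((p - grand) ** 2 for p in phenotypes)
    ss_marker = sum(len(v) * (mean(v) - grand) ** 2 for v in by_allele.values())
    # Proportion of phenotypic variance explained by the marker.
    r_squared = ss_marker / ss_total
    # Difference between allele means, as a percentage of the population mean
    # (assumed baseline).
    effect_pct = 100 * (mean(by_allele["B"]) - mean(by_allele["S"])) / grand
    return r_squared, effect_pct

r2, effect = marker_effects([2.0, 4.0, 1.0, 3.0], ["B", "B", "S", "S"])
print(r2, effect)
```

In this toy case the marker explains only a fifth of the variance, yet swapping alleles shifts the phenotype by 40% of the mean, which mirrors the pattern of small r² values alongside large substitution effects.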

Figure 3.
Distribution of Metabolomic QTL Effect Sizes.

HIF195 and Genetic Limitation on Detection Power

The power to detect a QTL is predominantly determined by its genetic r², the amount of phenotypic variance apportioned to a specific QTL within the given population. Within our analysis of the Bay × Sha population, the wide range of metabolite concentrations caused QTLs with strong allelic effects to have low genetic r² per QTL. This suggested that our analysis may have been underpowered and that numerous QTLs were not detected in this replicated analysis of 210 Bay × Sha RILs. To test this possibility, we utilized a higher power analysis focusing upon a single genomic region to test whether we could identify QTLs that were not observed in the whole genome analysis. To do this, we obtained HIF195, a heterogeneous inbred family (HIF) previously used to validate a fructose QTL on chromosome IV in the Bay × Sha RILs (Calenge et al., 2006). The two HIF195 genotypes are isogenic except for a small region on chromosome IV where one line is Bay and the other is Sha (see black box in Figure 2A). These lines allow a specific test of the allelic effect of a substitution at this genomic position.

The analysis of the 210 RILs revealed 10 metabolite QTLs in this region, including the expected fructose QTL, but this region did not exceed the significance threshold delineating metabolite QTL clusters (Figure 2A). We measured metabolite accumulation in the two HIF195 genotypes using a replicated design with the same growth conditions and analytical protocols as for the 210 RIL analysis and measured 297 metabolites. Scrutiny of the 10 metabolites with QTLs identified in the HIF region by genome-wide analysis showed that 8 metabolites differed between the two HIF genotypes at P = 0.05, with 3 of those, including fructose, being significant at P = 0.01 (see Supplemental Data Set 7 online). Expanding the analysis to the 287 other metabolites detected in this experiment identified significant differences in the accumulation of 42 additional metabolites between the two HIF195 genotypes, with an average allelic substitution effect of 28%. These differences were significant at the P = 0.01 level, where we expect three false-positives in 297 tests (64 total metabolites were significant at P = 0.05; see Supplemental Data Set 7 online). Polymorphism(s) in this region of the genome, therefore, altered diverse aspects of primary metabolism (Figure 4). Given the broad impact of this single region, which was not initially identified as a metabolic QTL hot spot, we conclude that recombination in the 210 RILs was likely insufficient to allow the detection of all metabolite QTLs present in this population (Beavis, 1994, 1998). HIF analysis may also increase QTL detection power if HIF195 possesses a genetic background that optimizes alleles present at loci epistatic to those directly tested by HIF195 to increase measured QTL effects. However, because genotypes at nontarget loci are fixed randomly in the HIF, any specific HIF has an equal chance of increasing or decreasing the power to detect significant differences in an epistatic combination. 
Where resources are available, testing multiple HIFs for a given region would allow the discrimination of genetic background effects.

Figure 4.
Metabolite QTL Hot Spots Alter Primary Metabolic Networks.

Metabolomic QTLs and Metabolic Pathways

To better understand the relationships among metabolite QTL clusters, we clustered metabolites based on QTL position and allelic effect, thus connecting metabolite QTL clusters based on shared regulation of specific metabolites. For instance, the AOP locus altered a predictable set of metabolites, given its role in glucosinolate metabolism, with metabolites such as Met and glucose-1-phosphate (G-1-P) forming a discernible cluster defined by a strong effect of AOP (Figure 2B; see Supplemental Data Set 8 online). Clustering analysis suggested that polymorphisms at the Met.I.42, Met.I.80, Met.II.15, and Met.II.47 loci affected a core set of metabolic pathways (see white box in Figure 2B). The other identified metabolite QTL clusters did not show specific metabolic pathway associations.

To visualize QTL effects, we developed a rough map of central plant metabolism (Mueller et al., 2003; Zhang et al., 2005) and plotted the known metabolites influenced by each QTL cluster (Figure 4). This showed that the previously associated Met.I.42, Met.I.80, Met.II.15, and Met.II.47 loci had global effects on central metabolism (Figures 2A and 4). While these four loci altered the accumulation of numerous unknown metabolites, they showed a significant enrichment for known metabolites in central metabolism (χ² test P < 0.001 for all four QTLs; Figure 4; see Supplemental Data Set 8 online). Most central metabolites showed a similar direction of allelic effect for a given QTL, such that for Met.I.42 and Met.II.15, the Bay allele led to higher accumulation for all of the plotted metabolites, while in Met.II.47, the Sha allele led to elevated metabolite levels (Figure 4). A similar bias was detected for the unknown metabolites (see Supplemental Data Set 4 online). In Met.I.80, the Sha allele led to higher accumulation of all metabolites except for glucose and fructose, which had decreased accumulation (Figure 4). This suggested that a polymorphism in Met.I.80 might have altered the interconversion of glucose-6-phosphate and fructose-6-phosphate into glucose and fructose, respectively. However, none of the Arabidopsis hexose kinases are located in this region, suggesting that Met.I.80 was not caused by a hexose kinase polymorphism.

In contrast with the above four metabolite QTL clusters with primarily central metabolism effects, the other seven clusters showed no significant bias for known or unknown metabolites (χ² test P > 0.10 for all seven QTLs) nor any significant metabolite groupings suggestive of particular functions (Figure 4; see Supplemental Data Set 8 online). For instance, the AOP locus altered the accumulation of Met and G-1-P, as would be expected given that Met and G-1-P are required for glucosinolate metabolism (Figure 4) (Kliebenstein et al., 2001a, 2001b; Wentzell et al., 2007). However, the AOP QTL was additionally associated with altered aromatic amino acid and fatty acid accumulation. The AOP QTL also altered the accumulation of numerous metabolites of unknown association with the aliphatic glucosinolate pathway. A similar pattern of diffuse network effects on central and unknown metabolites was found for the other metabolite QTL clusters (see Supplemental Data Set 8 online).

Logic Approach to Biochemical Network Generation with QTLs

Combining natural genetic variation with quantitative metabolite analysis enabled the identification and validation of parts of the aliphatic glucosinolate biosynthetic pathway (Magrath et al., 1993; Kliebenstein et al., 2001a, 2001c; Kroymann et al., 2003; Keurentjes et al., 2006; Wentzell et al., 2007). This used predominantly a logic-based method of pathway generation, such that if two metabolites were linked by enzymatic processes, a QTL affecting that process would lead to opposite effects on the two metabolites (see QTLX in Figure 5A) (Magrath et al., 1993; Mithen and Campos, 1996; Kliebenstein et al., 2001c). If the QTL repressed the enzymatic processes converting one metabolite to another, the first metabolite would increase at the cost of the accumulation of the second metabolite (Figure 5A). The reverse should be true if the QTL stimulated the enzymatic processes. We developed a computational approach to this problem and applied the algorithm to the glucosinolate pathway using data from the same 210 Bay × Sha RILs described in this report (Wentzell et al., 2007). This generated a hypothetical pathway that was similar to the known biosynthetic linkages (Figure 5B). One difference was the prediction from the logic-based hypothetical pathway that the AOP locus interconverts 3OHP and Allyl glucosinolates. However, the apparent interconversion was a result of the genetics of the AOP locus, in which two alleles distinguish between the production of the two compounds, due to differential expression of two tandem enzymes (Kliebenstein et al., 2001a). Thus, this logic algorithm approach could generate approximate metabolic pathways that were constrained by the underlying genetics.
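
The core logic rule described above (a shared QTL with opposite effects on two metabolites suggests a biosynthetic link) can be sketched as follows. This is only an illustration of the stated rule, not the algorithm applied in the study; the function name, data layout, and locus label `QTLX` used in the example are hypothetical.

```python
from itertools import combinations

def propose_links(qtl_effects):
    """Propose metabolite pairs linked by a shared QTL with opposite-signed effects.

    `qtl_effects` maps metabolite -> {locus: signed allelic effect}.
    Returns a list of (metabolite_1, metabolite_2, locus) tuples.
    """
    links = []
    for m1, m2 in combinations(sorted(qtl_effects), 2):
        shared = set(qtl_effects[m1]) & set(qtl_effects[m2])
        for locus in sorted(shared):
            # Opposite signs: one metabolite rises as the other falls,
            # as expected for a substrate and product of the affected step.
            if qtl_effects[m1][locus] * qtl_effects[m2][locus] < 0:
                links.append((m1, m2, locus))
    return links

example = {
    "precursor": {"QTLX": 0.4},
    "product": {"QTLX": -0.3},
    "bystander": {"QTLY": 0.2},
}
print(propose_links(example))
```

As the AOP case shows, a link proposed this way reflects the underlying genetics and need not correspond to a direct enzymatic interconversion.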

Figure 5.
QTL Utilization for Pathway Generation.

To test a broader application of this approach, we applied this logic algorithm to the QTL Cartographer data set generated using 210 Bay × Sha RILs to identify putative biochemical associations between metabolites (Figures 5C and 5D). One metabolic network associated a series of unknown metabolites with the identified compound 4-picolinate. 4-Picolinate is not currently known as a natural metabolite within Arabidopsis and may represent a breakdown product of dihydrodipicolinate or tetrahydrodipicolinate, metabolites required for Lys biosynthesis in Arabidopsis (Sarrobert et al., 2000). However, the lack of association between QTLs for Lys and picolinate fails to support this explanation for its presence (see Supplemental Data Set 6 online). Interestingly, 2-picolinate and modified forms of 4-picolinate can induce defense responses in plants, suggesting a potential biological role for this compound if it is truly produced in planta (Uknes et al., 1992; Zhang et al., 2004; Kim et al., 2006).

The second biochemical network identified also contained both unknown and identified metabolites, including fructose-6-phosphate and sedoheptulose, components of the pentose phosphate pathway (Figure 5D). This suggested that the pentose phosphate pathway participated in this network. The inclusion of shikimate, which is synthesized from erythrose-4-phosphate obtained from the pentose phosphate pathway, in this network supported this hypothesis (Figure 5D). Erythrose-4-phosphate was not a component of the reference metabolomics database, so it is possible that it was one of the unknown compounds. Additionally, the presence of succinate suggested that this biochemical network also involved the TCA cycle. Identification of these two networks suggested that a logic-based approach could derive biochemical networks directly from metabolomics QTL data. However, these associations did not provide direct evidence for the molecular nature of the QTL or the biochemical association.

The AOP and Elong QTLs and Biosynthetic Linkages

In addition to the logic approach to biosynthetic pathway analysis, previous work has shown that epistatic QTLs can provide insight into biochemical relationships between metabolites within a pathway (Mithen and Campos, 1996; Kliebenstein et al., 2001b; Lambrix et al., 2001; Keurentjes et al., 2006; Zhang et al., 2006; Wentzell et al., 2007). These studies focused on epistasis between the AOP and Elong QTLs, in which all detected aliphatic glucosinolates identified an interaction between AOP and Elong (Figure 6). We tested whether it was possible to link metabolites to the glucosinolate biosynthetic pathway based on their regulation by an epistatic interaction between AOP and Elong. We hypothesized that regulation of two metabolites by the same QTL–QTL interaction was less likely to result from chance or false detection than from shared regulation of two metabolites by a single QTL.

Figure 6.
Epistatic Interaction of AOP and Elong.

This epistatic association test identified 31 metabolites whose accumulation was determined by an epistatic interaction between these two loci (Figure 6). The majority of these 31 metabolites were unidentified compounds that share common QTLs with known glucosinolates and showed either positive or negative epistasis between AOP and Elong (Figure 6). This suggested that these metabolites, if not aliphatic glucosinolates themselves, may be intermediates in the biosynthetic pathway. Since some intermediates in this pathway are elongated Met derivatives and related organic acids, they are likely detectable with the GC-TOF-MS platform (Textor et al., 2007). Failure to identify QTLs at the ESP locus (controlling glucosinolate hydrolysis) for these unknown metabolites suggests that they are not hydrolysis products. Interestingly, this genetic approach identified metabolites peripherally associated with glucosinolate synthesis, such as glucose, glucose-1-phosphate, and Gly, as well as metabolites such as uracil and homoglutamine with no obvious linkage to aliphatic glucosinolate metabolism. Some of these metabolite associations could be explained by these additional metabolites having QTLs linked to, but not caused by, AOP and Elong. However, the requirement for epistatic interaction between these hypothetical additional QTLs diminishes the likelihood that this was a random association. As such, known epistatic loci facilitated building hypotheses regarding the association of additional metabolites with glucosinolate biosynthesis. While further work is required to validate the association of these metabolites with aliphatic glucosinolates, the association of unknown metabolites with known biosynthetic pathways enhances the utility of metabolite databases and suggests approaches to determining a biological role for these unknown metabolites.

Pairwise Epistasis

QTL studies focused on the accumulation of specific metabolites in Arabidopsis frequently identify epistatic interactions between QTLs with significant main effects (Kliebenstein et al., 2001b, 2002a; Loudet et al., 2003; Calenge et al., 2006; Pfalz et al., 2007). To test whether the global metabolome showed a similar enrichment in epistatic interactions, we formally tested all 55 pairwise epistatic interactions between the 11 metabolite QTL clusters against the average accumulation of 557 metabolites within the RILs. This informed model used less than one-third of the available degrees of freedom. Using a false discovery rate (FDR)–adjusted P value (<0.05), 240 metabolites were identified with between one and five significant epistatic interactions, for a total of 328 significant interactions (FDR < 0.1 identified 497 metabolites with 1538 significant interactions; see Supplemental Data Sets 9 and 10 online). For the majority of metabolites, the significant epistatic terms explained as much or more of the genetic variance, in terms of genetic r², as the combined main effect QTLs (Figure 7A). However, individual main effect QTLs showed larger effects than did the individual epistatic terms (Figure 7B). This shows that epistasis was widely prevalent in the determination of metabolite levels within this Arabidopsis RIL population. We did not conduct an unsupervised search of all possible pairwise epistatic interactions, as the use of only 210 RILs did not provide sufficient degrees of freedom.
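
Two pieces of bookkeeping in this analysis can be sketched briefly: enumerating the 55 pairwise terms among 11 clusters, and adjusting P values for the false discovery rate. The FDR procedure is not named in this section, so the Benjamini-Hochberg step-up adjustment below is an assumption, and the cluster labels are placeholders.

```python
from itertools import combinations

def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Placeholder labels for the 11 metabolite QTL clusters.
clusters = [f"cluster_{i}" for i in range(1, 12)]
cluster_pairs = list(combinations(clusters, 2))
print(len(cluster_pairs))  # 11 choose 2 pairwise interaction terms
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.5]))
```

Restricting the model to these pairs, rather than all marker pairs, is what keeps the number of fitted terms well within the degrees of freedom available from 210 RILs.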

Figure 7.
Comparison of Main QTL versus Epistatic Effects.

Pairwise Epistasis and QTL Association

Research on mutants in yeast metabolic genes identified epistasis more frequently between loci that function within the same or closely related metabolic pathways (Segre et al., 2005). As such, QTLs whose pairwise epistatic interactions regulate an enhanced number of metabolites may regulate a metabolic network. Because the pairwise marker comparisons have a broad range of two-way allele frequencies, we used a permutation approach to identify pairwise interactions regulating more metabolites than expected. This identified 9 of the 55 possible pairwise epistatic interactions as affecting more metabolites than would be expected by chance (Figure 8). The overrepresented pairwise marker interactions connected 8 of the 11 identified metabolite QTL clusters. The Met.II.47 QTL was the main hub of this network, connecting directly to five other metabolite QTL clusters and apparently altering the accumulation of metabolites associated with the TCA cycle and sulfur metabolism. Interestingly, the interaction between Met.II.47 and Met.I.16 altered the accumulation of nearly all fatty acids (Figure 8B). This suggests that the often overlooked effect of epistasis can be utilized to aid in network generation with large genomic data sets. The epistatic interaction between AOP and Elong, while previously experimentally validated, did not exceed the permutation-derived significance threshold, suggesting that our permutation approach to determine significance level left a number of false-negatives and that the observed epistasis network was likely conservative.

Figure 8.
Epistatic Metabolite QTL Networks.

For each interaction, we estimated a measure of epistasis (ε). A positive value of ε indicated that recombinant genotypes had less metabolite than the parental genotypes; negative ε indicated elevated metabolite content in recombinant genotypes (Figure 8). The AOP × Elong interaction displayed a mixture of positive and negative epistasis. Such a mixture of epistatic effects on a set of metabolites suggested that the polymorphisms creating these epistatic QTLs were located within the biosynthetic pathway. We applied this interpretation to other, less characterized epistatic interactions. In the Met.II.47 × Met.V.67 interaction, positive epistasis only affected fatty acid accumulation, while negative epistasis affected all other measures of central metabolism (see Supplemental Figure 3F online). This suggested that Met.II.47 × Met.V.67 interacted to regulate the partitioning of carbon between fatty acids and other components of central carbon metabolism. Similarly, in the Met.I.80 × Met.II.47 interaction, the sign of epistasis changed between short- and long-chain fatty acids, suggesting that this interaction may regulate fatty acid elongation (see Supplemental Figure 3E online). These analyses suggested that the Met.II.47 QTL affects the partitioning of carbon within the plant and may be a major regulator of Arabidopsis metabolism. The remaining pairwise marker interactions showed exclusively positive (see Supplemental Figures 3A and 3B online) or negative (Figure 8B; see Supplemental Figures 3B and 3G online) epistasis. All together, this analysis showed that considering the sign of epistatic interactions can enhance the interpretation of metabolite QTL effects.

Three-Way Epistasis

Most studies of plant quantitative traits are limited to the analysis of pairwise epistasis, but this lack of information does not preclude the potential for higher order interactions. Analysis of the main effects of three central metabolism QTLs, Met.I.42, Met.II.15, and Met.II.47 (Figure 2), and their pairwise interactions showed that these QTLs may coordinate central metabolism through metabolites associated with the TCA cycle (Figure 8; see Supplemental Figure 3 online). We used an ANOVA model to test whether TCA metabolites identify a three-way epistatic interaction between Met.I.42, Met.II.15, and Met.II.47. Five of the eight TCA metabolites that biosynthetically link citrate to Gln showed significant three-way epistasis between Met.I.42, Met.II.15, and Met.II.47 (Figure 9). Interestingly, a single recombinant genotype had elevated content of all five metabolites (Figure 9). All individuals in this genotypic class, Met.I.42BAY, Met.II.15BAY, and Met.II.47SHA, showed a 250 to 400% increase in the accumulation of TCA metabolites in comparison with the average RIL (Figure 9). This large epistatic allelic effect contrasted with the average 30% allelic effect on these metabolites estimated for Met.I.42, Met.II.15, and Met.II.47 as main-effect QTLs. The biosynthetic linkage between these metabolites suggests that polymorphisms at these three loci interact to regulate carbon accumulation in the TCA cycle.

Figure 9.
Three-Way Epistasis and the TCA Cycle.

Including the other central metabolism QTL, Met.I.80, in the model did not affect this result, nor was a three-way interaction found with any two of these three QTLs and the Met.I.80 locus for the TCA metabolites. A comparison with leaf area also showed that this three-way epistasis was associated with a severe limitation in plant growth, such that the class with the maximal TCA accumulation had a nearly 60% reduction in growth. Given the statistical limitation of the number of lines and available recombination, we did not conduct an unsupervised screen for all possible three-way epistatic interactions with all metabolites.


This study identified numerous QTLs regulating plant metabolism (Figure 2). The average metabolite's accumulation was affected by multiple QTLs of moderate effect with epistatic interactions, such that the contribution of pairwise epistatic effects frequently equaled that of the main-effect QTLs. The detection of at least one higher order (three-way) epistatic interaction implies that fully querying this complexity will require larger mapping populations than are currently available (Figure 9). Identification of the molecular bases of these metabolic QTLs, especially QTLs controlling epistatic interactions, could provide tools for modifying plant central metabolism. This will also improve our understanding of the molecular basis of epistasis as well as the integration of plant central metabolism with development and physiology. Given this estimated complexity, the medium-sized population (210 RILs) utilized for this study likely revealed only a fraction of the natural variation modulating metabolic diversity between the Bay and Sha accessions, let alone within Arabidopsis as a species. Thus, the genetic architecture regulating the Arabidopsis metabolome is highly complex.

The identified metabolite QTLs formed 11 clusters (Figure 2). Four of these clusters directly associated with central metabolism (Figure 4), and four different clusters epistatically associated with central metabolism (Figure 8). Two additional clusters, AOP and Elong, were associated with secondary metabolism, specifically aliphatic glucosinolate metabolism (Figure 6) (Kliebenstein et al., 2001b, 2001c; Wentzell et al., 2007). This contrasts with a previous report that three of four metabolite clusters detected in Arabidopsis, including the AOP and Elong loci, were secondary metabolite–associated (Keurentjes et al., 2006). This difference is likely caused by platform-specific detection bias, as the GC-TOF-MS platform detects predominantly primary metabolites in addition to numerous unknown metabolites (our study), while the liquid chromatography–mass spectrometry platform detects numerous secondary metabolites (Fiehn, 2002; Keurentjes et al., 2006).

Metabolic Pathway versus Physiology

Central metabolism QTL clusters frequently displayed directional bias in their allelic effects, such that all metabolites were either increased or decreased for a given genotype (Figure 4). Simple enzyme polymorphisms affecting the conversion of one metabolite to another would lead to differential allelic effects upon metabolites, depending on the relative positions of the enzyme and metabolites within the pathway (Figures 5 and 6). The unidirectionality of allelic effects for the central metabolism QTL clusters suggests a more complex basis for these QTLs. Several of the identified metabolite QTL clusters also associate with major gene expression polymorphisms (West et al., 2007). These pleiotropic QTLs may regulate developmental or physiological differences in the plant that subsequently alter central metabolism. Alternatively, the effect of the QTLs upon central metabolism might indirectly alter development or physiology. An association between central metabolism and physiology/development has been found in other Arabidopsis RIL studies. In two studies, a major metabolite QTL cluster associated with an experimentally induced polymorphism in the developmental regulator protein ERECTA (Keurentjes et al., 2006; Meyer et al., 2007). Explaining the bases of central metabolism QTL clusters will require detailed analyses of physiological and developmental traits in the Bay × Sha population. At this stage, the association between growth and central metabolism is purely correlational and does not facilitate speculation regarding the specific direction of the causal relationships between metabolic polymorphisms and development/physiology polymorphisms. Determining the direction of these relationships will require cloning the underlying loci and validating their molecular functions. These relationships, however, do suggest that RIL analysis directed toward specific analysis of metabolomic variation may be most successful in populations that do not segregate for variation in genes known to regulate developmental or physiological phenotypes, such as flowering time. Eliminating known developmental variation would reduce the effects of developmental or physiological variation on the metabolome, but it would not eliminate the potential for metabolic variation to alter development or physiology.

Generating Metabolic Pathways via Logic Algorithms

Previous research suggested that natural metabolic variation may be useful in predicting and cloning genes in metabolic pathways (Kliebenstein et al., 2001b, 2001c, 2002b; Lambrix et al., 2001; Kroymann et al., 2003; Benderoth et al., 2006; Keurentjes et al., 2006; Wentzell et al., 2007). Within our QTL data set, unknown metabolites were sometimes paired such that one unknown was positively affected by a QTL while the other was negatively affected by the same QTL (see Supplemental Data Set 6 online). Application of a logic-based approach to the QTL data generated two putative biochemical networks (Figure 5). One network involved the pentose phosphate and TCA cycles but contained a number of unidentified metabolites. These unknown metabolites may be intermediates in these pathways or altered forms of known metabolites. The second network centered on 4-picolinate, a compound that has not been detected previously within Arabidopsis. As such, further work is required to determine whether this is a breakdown product of Lys biosynthetic intermediates or it represents a new pathway for a previously undetected metabolite.

The arrows linking metabolites in models generated by this approach do not imply a single enzymatic step between these metabolites. Nor do they imply that the QTL is caused by polymorphism within a single enzyme. As found for glucosinolates, some major QTLs affect individual enzymes and also alter the expression of the entire biosynthetic pathway (Wentzell et al., 2007). Connection of metabolites by this approach indicates only their potential relationship via enzymatic interconversion. This relationship may be mediated by a single biosynthetic enzyme, an entire pathway, or, as found with the AOP locus, opposing reactions using the same unmeasured precursor (Figure 5B). While specific relationships between metabolites are not revealed, this approach allows the generation of new metabolic hypotheses and provides an ability to link unknown metabolites with known biosynthetic pathways through metabolomic QTLs. Increasing the power of QTL detection and mapping precision could greatly expand our ability to generate networks and explore their relationships.

Glucosinolate Bias?

Research suggesting the utility of genomics and QTLs for pathway generation has largely focused on the aliphatic glucosinolate biosynthetic pathway within Arabidopsis (Kliebenstein et al., 2001a, 2001c; Hirai et al., 2005, 2007; Keurentjes et al., 2006; Hansen et al., 2007; Sønderby et al., 2007). Recent results show altered sequence and gene expression polymorphism patterns in this pathway, indicative of selective pressures driving the formation and maintenance of structural diversity (Kliebenstein et al., 2001b, 2001c, 2002b; Lambrix et al., 2001; Kroymann et al., 2003; Benderoth et al., 2006; Keurentjes et al., 2006; Wentzell et al., 2007; Kliebenstein, 2008). Closer analysis of glucosinolate-associated unknown metabolites showed that they had significantly elevated heritability in comparison with the rest of the metabolites (glucosinolate metabolite heritability = 35%, metabolomic average = 25%; P < 0.001), providing greater detection power. Within Arabidopsis, this combination of high QTL detection power and diversifying selection may be unique to glucosinolate metabolites. Nevertheless, it appears that our logic approach to network generation can be extended to other metabolites, including unknowns (Figure 5). Other plant species may possess at least one secondary metabolite pathway with similar genetic characteristics to the glucosinolate pathway, potentially allowing these specialized pathways to be rapidly identified and characterized via similar genomics approaches.

Genetic Power

QTL studies frequently identify one or two large-effect QTLs per trait, along with a suite of small-effect QTLs. It has been proposed that creating a large number of small structured populations (<200 lines per parental pair) from a diverse set of germplasm will allow the characterization of most large-effect QTLs in a species. Our data, however, suggest that the average allelic effect is less limiting for QTL detection than is the number of recombination events (Beavis, 1994, 1998). Several metabolite QTL clusters were located <30 cM apart, diminishing our ability to discern their independent effects on the metabolome (Figure 2). The total genetic r² explained by main-effect and epistatic QTLs was on average <30%, suggesting the contribution of multiple unidentified genetic loci with moderate phenotypic effects, as found in a recent yeast eQTL study (Brem and Kruglyak, 2005). Strikingly, focused analysis of the HIF195 genotypes identified a large number of new metabolite QTLs not found in the 210 RILs, with most showing moderate (20 to 50%) allelic substitution effects (see Supplemental Data Set 7 online). The knowledge that this locus, previously identified as a fructose QTL (Calenge et al., 2006), alters additional metabolites may aid in the identification of candidate genes with the potential to influence multiple metabolites.

The prevalence of pairwise and potentially higher order epistases also diminishes QTL detection power in small to medium populations. For instance, testing for a pairwise epistatic interaction splits the population into four genotypic classes versus only two groups for a main-effect QTL. Fewer measurements per genotypic class decrease statistical power. The potential impact of epistasis is magnified by the existence of three-way epistasis, such as found in this study (Figure 9). Reanalysis of total aliphatic glucosinolate data from 403 Bay × Sha RILs identified three different three-way interactions influencing >20% of trait variation (P < 0.001 for AOP × Elong × MYB28, AOP × Elong × GSL.ALIPH.III.15, and AOP × GSL.ALIPH.III.15 × MYB28), while no three-way interactions were detected in a previous study using a 210 RIL subset of the same population (Wentzell et al., 2007). This supports our hypothesis that higher order epistasis is common in the Arabidopsis metabolome and highlights the need for larger mapping populations to detect these interactions. This high level of epistasis may be a consequence of the interconvertibility of metabolites within the metabolome, suggesting that epistasis might be of greater consequence in metabolomics than transcriptomics.
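The loss of power with increasing interaction order can be made concrete with a short calculation. The sketch below assumes two alleles per locus and perfectly balanced genotypic classes, which a real RIL population only approximates; the function name is illustrative.

```python
def lines_per_class(n_lines: int, n_loci: int) -> float:
    """Average number of lines per genotypic class for an interaction
    among n_loci biallelic loci, assuming balanced classes."""
    n_classes = 2 ** n_loci          # Bay or Sha allele at each locus
    return n_lines / n_classes

for n_lines in (210, 403):
    for n_loci in (1, 2, 3):
        print(n_lines, n_loci, 2 ** n_loci, lines_per_class(n_lines, n_loci))
```

Under these assumptions, a three-way interaction in 403 lines leaves roughly as many lines per class as a pairwise interaction in 210 lines.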

Metabolite versus Transcript Heritabilities

Previous work had shown that for glucosinolates, the average metabolite had a lower heritability than did the transcripts encoding the biosynthetic pathway (Wentzell et al., 2007). This GC-TOF-MS metabolomics data set shows that the Arabidopsis metabolome as measured in the Bay × Sha RILs shows lower heritability than the transcriptome (Figure 1). Published estimates of technical error for the GC-TOF-MS and Affymetrix microarray platforms suggest that differences in the level of phenotypic variance explained are not caused predominantly by differences in measurement error between these two technologies (Weckwerth et al., 2004b). Additionally, heritability for the putative glucosinolate-related metabolites detected by the GC-TOF-MS platform (35%; Figure 6) was similar to the heritability for the glucosinolates in these same lines as measured by HPLC in a larger experiment using this same population (40%; Wentzell et al., 2007), which further supports the notion that this difference is not purely technical.

Differences in the genetic regulation of these two classes of trait may be explained by the simple fact that transcripts are functionally linked to potential DNA polymorphisms in their genes or regulators and, as such, have less potential for the introduction of biological noise between genetic variation and the measured transcript variation. In comparison, the measured metabolite variation requires a DNA polymorphism to be processed via transcription and translation. This extra step may introduce more biological noise. Another explanation for the difference in heritability between transcripts and metabolites is that metabolites are often interrelated, such that the atoms in one metabolite can potentially be transferred to a different metabolite in relatively few steps (Fell and Wagner, 2000; Jeong et al., 2000; Arita, 2004). This metabolic interconnectedness may increase the effect of small biological perturbations, introducing more noise in metabolic networks than in corresponding transcripts. These possibilities could be tested in a replicated experiment measuring global transcripts and metabolites in a structured population in the presence and absence of a controlled environmental perturbation (Van Leeuwen et al., 2007). If the metabolic network amplifies biological noise, then the genotype × environment interaction term should be greater for metabolites than for global transcripts.


Plant Material and Experimental Conditions

Seeds for Arabidopsis thaliana accessions Bay, Sha, and the F8 generation of a Bay × Sha RIL population (Loudet et al., 2002) were obtained from The Arabidopsis Information Resource (stock number CS57920; www.arabidopsis.org). The advanced generation RILs were created by single-seed descent from the F2 generation (Loudet et al., 2002). The RIL (F8) plants and parental accessions were grown in a single growth chamber and allowed to self-pollinate; sufficient seed was harvested from individual plants to conduct our replicated experiments as well as a previous microarray experiment utilized to map eQTL (Kliebenstein et al., 2006; West et al., 2007).

We conducted two independent measurements of the Bay × Sha RIL metabolites, separated by 2 months. Within each experiment, two plants for each of 210 RILs, plus parental controls, Bay and Sha, were grown using the growth chambers and conditions previously utilized for seed production and eQTL studies (West et al., 2006). Briefly, seeds were sown directly onto Premier ProMix B potting soil (Premier Brands) in 36-cell (~125 cm³ soil volume per cell) flats and grown in controlled-environment chambers at 20°C with 8 h of light at 100 to 120 μE. Within an experiment, two randomized blocks each contained one plant per RIL. Prior to planting, seeds were imbibed and cold-stratified at 4°C for 3 d to break dormancy. Seedlings were thinned to one plant per cell at 7 d after planting. At 6 weeks after germination, the plants were harvested and extracted for metabolomics analysis.

Metabolomics Analysis

For each metabolomics sample, one leaf disc from each of two leaves per plant was harvested, providing two leaf discs of ~20 mg total weight. All harvesting started at subjective midday, finishing within 2 h. The harvest order of the four blocks was randomized to minimize circadian effects. Each plant was independently harvested and extracted as per published protocols, providing a total of 864 samples, 24 for parental genotypes and 840 for the RILs (Weckwerth et al., 2004a, 2004b; Meyer et al., 2007). The samples were stored dry at −80°C until automated derivatization and GC-TOF-MS analysis at the University of California Davis Genome Center Metabolomics Facility (http://metabolomics-core.ucdavis.edu/; Fiehn et al., 2005). Metabolite identity was determined by comparing retention time and mass with the University of California Davis Genome Center Metabolomics Facility metabolites database (http://fiehnlab.ucdavis.edu/Metabolite-Library-2007/; Fiehn et al., 2005). This library contains reference spectra for 713 known metabolites, generated by the analysis of purified reference compounds. Metabolites not contained within this library are listed as unknown or unidentified metabolites. This analysis identified 636 putative metabolites within the entire experiment (see Supplemental Data Set 1 online). After analysis, samples were subjected to quality control by further analysis of samples with internal standard values of >3σ from the experimental mean within a given experiment, suggesting faulty derivatization or GC-TOF-MS analysis of that specific sample. Given the number of samples, we chose 3σ rather than 2σ to exclude only extreme (biologically improbable) outlier samples. Because we had 210 lines with replication, the inclusion of 2σ outliers should have minimally affected the overall outcome. Excluding real 2σ outliers (caused by biology, not error) could significantly diminish the power of the experiment. Outlier samples were then tested for poor extraction quality by screening for samples with extreme (>3σ) outlier values for the average accumulation of amino acids and TCA metabolites. The flagged samples were distributed randomly among the RILs, suggesting that they were outliers due to technical error rather than genetic segregation. Removal of these samples from our analysis left a total of 769 metabolite measurements across 210 lines.
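A 3σ screen of this kind can be sketched as follows. This is a minimal illustration, assuming one summary value per sample (for example, an internal standard intensity) and the sample SD computed across all samples in an experiment; the sample names and values are hypothetical.

```python
from statistics import mean, stdev

def flag_outliers(values_by_sample, n_sigma=3.0):
    """Return the names of samples lying more than n_sigma sample SDs
    from the mean of all samples."""
    values = list(values_by_sample.values())
    mu, sd = mean(values), stdev(values)
    return sorted(name for name, v in values_by_sample.items()
                  if abs(v - mu) > n_sigma * sd)

# Hypothetical internal-standard intensities: 20 typical samples, one failed run.
toy = {f"s{i}": 100.0 for i in range(20)}
toy["bad"] = 1000.0
flagged = flag_outliers(toy)
print(flagged)
```

Because an extreme value also inflates the SD, a screen like this behaves conservatively in small batches; it is better suited to the hundreds of samples per experiment described here.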

Estimation of Heritability

We calculated estimates of broad-sense heritability (H) for each metabolite as $H = \sigma^2_g / \sigma^2_p$, where $\sigma^2_g$ was the estimated metabolite level genetic variance among different genotypes in this sample of 210 RILs or two parental genotypes and $\sigma^2_p$ was the phenotypic variance for a trait (Liu, 1998). To estimate heritability between the Bay and Sha parental genotypes, we utilized the 371 metabolites detected in the parents in both independent RIL experiments plus an additional 26 metabolites that were detected in common between at least one of the two RIL experiments and a third parental experiment (see Supplemental Data Set 2 online). This data set contained 18 biological replicates per parent (Bay and Sha): 6 replicates of each parent grown concurrently with each RIL experiment and 6 replicates of each parent in a third experiment containing just the parents.

Heritability within the RILs was estimated for the 371 metabolites detected in both independent RIL experiments (see Supplemental Data Set 3 online). This data set included four biological replicates for each of the 210 RILs, permitting the estimation of heritability for each metabolite within the RIL population. Similar information is provided for those metabolites identified in only one of the two replicates, but these were not used for heritability comparisons (see Supplemental Data Set 4 online). Metabolite H was compared with previously estimated heritability values (West et al., 2007) for all transcripts using the same 210 lines.
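For orientation, the ratio of genetic to phenotypic variance can be estimated from replicated line data with a one-way ANOVA variance-component calculation. The sketch below is an assumption about how such an estimate could be computed, not a description of the scripts used here: it requires a balanced design, ignores experiment and block effects, and truncates negative genetic variance estimates at zero. The function name and toy values are hypothetical.

```python
from statistics import mean

def broad_sense_heritability(replicates_by_line):
    """Estimate H = var_g / var_p from {line: [replicate values]} (balanced design)."""
    groups = list(replicates_by_line.values())
    g = len(groups)                      # number of genotypes (lines)
    r = len(groups[0])                   # replicates per line
    if any(len(v) != r for v in groups):
        raise ValueError("balanced design expected")
    grand = mean(x for v in groups for x in v)
    ms_between = r * sum((mean(v) - grand) ** 2 for v in groups) / (g - 1)
    ms_within = sum((x - mean(v)) ** 2 for v in groups for x in v) / (g * (r - 1))
    var_g = max((ms_between - ms_within) / r, 0.0)   # genetic variance component
    var_p = var_g + ms_within                        # phenotypic variance
    return var_g / var_p if var_p > 0 else 0.0

toy = {"RIL1": [10.0, 12.0], "RIL2": [20.0, 22.0], "RIL3": [30.0, 32.0]}
print(broad_sense_heritability(toy))
```

In the toy data the lines differ far more than replicates within a line, so the estimate is close to 1; identical line means would give 0.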

QTL Analysis

The Bay × Sha RIL population has been genotyped previously (Loudet et al., 2002), and additional markers were obtained from the expression QTL analysis (West et al., 2006) as well as for the GSL-AOP and GSL-Elong loci (Kliebenstein et al., 2001b, 2002b; Textor et al., 2004; Benderoth et al., 2006; Wentzell et al., 2007). To detect metabolomic QTLs, we utilized the average metabolite accumulation per RIL across all experiments (see Supplemental Data Set 5 online). We tested only the 557 metabolites present in at least 5% of the RILs for QTLs. Seventy-five percent of those metabolites were detected in at least 25% of the RILs, with 232 of the metabolites detected in >90% of the RILs. There was a slight positive but nonsignificant correlation between the number of RILs in which a metabolite occurred and the number of QTLs identified for that metabolite (P = 0.08, r² = 0.04), suggesting that including metabolites present in a minority of RILs did not introduce significant bias.

The average per RIL accumulation for each metabolite was used to map QTLs within Windows QTL Cartographer version 2.5 (Basten et al., 1999; Zeng et al., 1999; Wang et al., 2006). CIM was implemented using Zmap (model 6) with a 10-cM window and an interval-mapping increment of 1 cM. Forward regression was used to identify five cofactors per metabolite trait. The declaration of statistically significant QTLs was based on permutation-derived empirical thresholds using 1000 permutations for each trait mapped (Churchill and Doerge, 1994; Doerge and Churchill, 1996). CIM using permutations to assign significance based on the underlying trait distribution is robust at handling normal or nearly normal trait distributions (Rebai, 1997), as found for most of our metabolites. The Eqtl module of QTL Cartographer identified the location, additive effect, and 1-logarithm of odds interval of each significant QTL for each trait (see Supplemental Data Set 6 online) (Wang et al., 2006).

QTL clusters were identified using a QTL summation approach in which the position of each QTL for each trait × experiment combination was indicated by a 1 and the number of traits regulated by a QTL at a given position was summed (Kliebenstein et al., 2006). This summation was conducted within a 10-cM sliding window. Permutation analysis with 1000 permutations determined a cluster size of 12 metabolite QTLs as the threshold for cluster significance (α = 0.05) (West et al., 2007). The QTL clusters identified defined genetic positions that were named with respect to their positions using the Arabidopsis Genome Initiative code of the marker closest to the peak. The last three digits of the Arabidopsis Genome Initiative code were removed to avoid confusion with the specific gene and to highlight that it was an approximate position. The QTLs detected at the previously characterized and cloned AOP and Elong loci were named as such (Magrath et al., 1994; Kliebenstein et al., 2001a, 2001b; Kroymann et al., 2003).
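The summation step can be illustrated with a small sketch. It assumes QTL peak positions (in cM) pooled across traits for one chromosome, a window that advances in 1-cM steps, and the cluster threshold of 12 taken as given rather than rederived by permutation; the function names, step size, and toy positions are assumptions for illustration.

```python
def window_counts(qtl_positions_cm, chrom_length_cm, window_cm=10.0, step_cm=1.0):
    """Count QTL peaks falling in each sliding window [start, start + window_cm)."""
    counts = []
    start = 0.0
    while start + window_cm <= chrom_length_cm:
        n = sum(1 for p in qtl_positions_cm if start <= p < start + window_cm)
        counts.append((start, n))
        start += step_cm
    return counts

def significant_windows(counts, threshold=12):
    """Keep windows whose QTL count reaches the cluster threshold."""
    return [(start, n) for start, n in counts if n >= threshold]

# Hypothetical chromosome: 12 QTL peaks packed between 45 and 50.5 cM, 2 elsewhere.
positions = [45.0 + 0.5 * i for i in range(12)] + [5.0, 80.0]
hits = significant_windows(window_counts(positions, chrom_length_cm=100.0))
print(hits)
```

Adjacent significant windows would then be merged into a single cluster and named after the marker nearest the peak.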

Metabolite QTL Hot Spot ANOVA

To test directly the effect of each identified QTL cluster on all metabolites, we used an ANOVA model containing the markers most closely associated with each of the 11 significant metabolite QTL clusters as individual main-effect terms. For each metabolite, the average accumulation in lines of genotype g at marker m was shown as ygm. The model for each metabolite was as follows:

$$y_{gm} = \mu + \sum_{m=1}^{11} M_{gm} + \varepsilon_{gm}$$

where g = Bay or Sha and m = 1, …, 11. The main effect of the 11 markers was denoted as M. The error, $\varepsilon_{gm}$, was assumed to be normally distributed, with mean 0 and variance $\sigma^2_\varepsilon$. An automated SAS script tested all metabolites within the model and returned all P values, type III sums-of-squares for the complete model and each main effect, and QTL main-effect estimates (in terms of allelic substitution values) (see Supplemental Data Set 8 online). Significance values were corrected for multiple testing within each model using FDR < 0.05 in the automated script.
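The text does not state which FDR procedure the SAS script applied. As an illustration only, the following sketch assumes the Benjamini-Hochberg step-up adjustment, a common choice for this kind of correction; the function name and example P values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])   # indices by ascending P
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

raw = [0.01, 0.04, 0.03, 0.20]
adj = benjamini_hochberg(raw)
significant = [p < 0.05 for p in adj]
print(adj, significant)
```

Terms whose adjusted P value falls below 0.05 would be retained as significant under this assumed procedure.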

Logic Algorithm for Pathway Generation

To identify putative biochemical associations between metabolites, we utilized a logic algorithm. This logic algorithm first compartmentalized all metabolites into bins based on the metabolites having QTL peaks no farther apart than 2 cM, based on the QTL Cartographer output (see Supplemental Data Set 6 online). Assuming that a single locus is affecting the metabolites at a given QTL, that the metabolites are measured in the same samples, and that all statistics are performed with the same marker genotypes, the QTL peaks caused by a single locus should be very tightly linked. The bins were then tested for the presence of metabolites with allelic effects in the opposite direction, and only those containing metabolites with opposite allelic effects were kept. We then required the average absolute value of the allelic substitution effect within a bin to be >25%, using the glucosinolate QTLs as guides to set this threshold. Maintaining the threshold at this level allowed us to focus on the larger effect QTLs. Omitting this threshold led to the presence of more bins and a more complicated biochemical network but did not destroy the presented networks. This generated a list of potential biochemical associations between metabolites (see Supplemental Data Set 6 online). To generate pathways, we identified metabolites present in multiple bins and used them to link metabolites into potential biochemical pathways. All QTL information used to generate these putative networks was statistically validated as described in the QTL section. This approach provided a putative network that allowed the inclusion of known and unknown metabolites.
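The binning and filtering steps described above can be sketched as follows. This is one possible reading, not the original implementation: it assumes that a bin collects all QTL peaks within 2 cM of the first peak in the bin, that allelic effects are expressed as signed percentages, and that candidate links are drawn only between metabolites with opposite effects in the same bin. All record fields, names, and values are hypothetical.

```python
from itertools import combinations

def bin_qtls(qtls, max_gap_cm=2.0):
    """Group QTL records whose peaks lie within max_gap_cm of the bin's first peak."""
    bins, current = [], []
    for q in sorted(qtls, key=lambda q: (q["chrom"], q["peak_cm"])):
        if current and (q["chrom"] != current[0]["chrom"]
                        or q["peak_cm"] - current[0]["peak_cm"] > max_gap_cm):
            bins.append(current)
            current = []
        current.append(q)
    if current:
        bins.append(current)
    return bins

def keep_bin(qtl_bin, min_mean_abs_effect=25.0):
    """Keep bins with opposite allelic effects and mean |effect| above the threshold."""
    effects = [q["effect_pct"] for q in qtl_bin]
    opposite = min(effects) < 0 < max(effects)
    return opposite and sum(abs(e) for e in effects) / len(effects) > min_mean_abs_effect

def candidate_links(qtls):
    """Return metabolite pairs with opposite effects in a retained bin."""
    links = set()
    for qtl_bin in filter(keep_bin, bin_qtls(qtls)):
        for a, b in combinations(qtl_bin, 2):
            if a["effect_pct"] * b["effect_pct"] < 0:
                links.add(frozenset((a["metabolite"], b["metabolite"])))
    return links

toy = [
    {"metabolite": "unknown1", "chrom": 1, "peak_cm": 10.0, "effect_pct": 40.0},
    {"metabolite": "unknown2", "chrom": 1, "peak_cm": 11.0, "effect_pct": -35.0},
    {"metabolite": "unknown2", "chrom": 2, "peak_cm": 50.0, "effect_pct": 30.0},
    {"metabolite": "known1", "chrom": 2, "peak_cm": 51.5, "effect_pct": -45.0},
    {"metabolite": "known2", "chrom": 4, "peak_cm": 20.0, "effect_pct": 10.0},
    {"metabolite": "known3", "chrom": 4, "peak_cm": 20.5, "effect_pct": -12.0},
    {"metabolite": "unknown3", "chrom": 5, "peak_cm": 70.0, "effect_pct": 60.0},
]
links = candidate_links(toy)
print(sorted(sorted(pair) for pair in links))
```

In the toy data, `unknown2` appears in two retained bins and therefore chains `unknown1` and `known1` into one putative pathway, while the small-effect bin on chromosome 4 and the single-metabolite bin on chromosome 5 are dropped.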

Metabolite QTL Epistasis Analysis

To test directly for epistatic interactions between the metabolite QTL clusters, we conducted an ANOVA using a model containing all 11 metabolite QTL clusters as individual main-effect terms using the markers most closely associated with each significant QTL cluster. Additionally, we tested all 55 possible pairwise interactions between these markers. For each metabolite, the average accumulation in lines of genotype g at marker m was shown as ygm. The model for each metabolite was as follows:

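$$y_{gm} = \mu + \sum_{m=1}^{11} M_{gm} + \sum_{m=1}^{10} \sum_{n=m+1}^{11} M_{gm} M_{gn} + \varepsilon_{gmn}$$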

where g = Bay or Sha, m = 1, …, 11, and n was the identity of the second marker for an interaction. The main effect of the 11 markers was denoted as M. The error, $\varepsilon_{gmn}$, was assumed to be normally distributed with mean 0 and variance $\sigma^2_\varepsilon$. An automated SAS script was developed to test all metabolites within this ANOVA and return all P values, type III sums-of-squares for the complete model and each individual term, and QTL pairwise effect estimates in terms of allelic substitution values (see Supplemental Data Sets 9 and 10 online). Pairwise interactions were defined as having a significant epistatic cluster if they affected >5% of the tested metabolites. Significance values were corrected for multiple testing within a model using FDR < 0.05 in the automated script.

To quantify the epistatic interactions, we adapted an approach utilized to quantify epistatic interactions altering fitness (Elena and Lenski, 1997; Segre et al., 2005). For fitness, nonscaled epistasis is measured as $\varepsilon = W_{XY} - W_X W_Y$, where $W_X$ and $W_Y$ are the fitness values of the single mutants and $W_{XY}$ is the fitness of the double mutant. Fitness was measured as the ratio of growth between the mutant genotype and the wild type. We adapted this measure of epistasis to quantify the epistasis for each metabolite's accumulation as regulated by two QTLs, X and Y, so that $\varepsilon = M_{XY} - M_X M_Y$. $M_X$ represented the ratio of a metabolite's average accumulation in the RILs containing a Sha allele at QTL X and a Bay allele at QTL Y to the average metabolite accumulation in the RILs containing a Bay allele at both QTLs. Conversely, $M_Y$ was the ratio of a metabolite's average accumulation in the RILs containing a Sha allele at QTL Y and a Bay allele at QTL X to that metabolite's average accumulation in the RILs containing a Bay allele at both QTLs. $M_{XY}$ was the ratio of a metabolite's average accumulation in the RILs containing a Sha allele at both QTLs to the metabolite's average accumulation in the RILs containing a Bay allele at both QTLs. The sign of ε was independent of which parental genotype was used in the denominator. Positive values of ε indicated positive epistasis, whereby the recombinant progeny had lower metabolite accumulation than the parental genotypes. Negative ε indicated negative epistasis, whereby the recombinant progeny had greater metabolite accumulation than the parental genotypes (Elena and Lenski, 1997; Segre et al., 2005).
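As a worked illustration of this measure, the sketch below computes ε as the double-Sha ratio minus the product of the two single-substitution ratios, each taken relative to the Bay/Bay class mean. It assumes one record per RIL with an allele call at each QTL and a metabolite level; the field names and toy values are hypothetical, and every genotypic class must contain at least one line.

```python
from statistics import mean

def epistasis_epsilon(records, qtl_x, qtl_y):
    """Compute epsilon = M_XY - M_X * M_Y from per-line records."""
    def class_mean(allele_x, allele_y):
        return mean(r["level"] for r in records
                    if r[qtl_x] == allele_x and r[qtl_y] == allele_y)

    base = class_mean("Bay", "Bay")            # Bay parental class (denominator)
    m_x = class_mean("Sha", "Bay") / base      # Sha allele at X only
    m_y = class_mean("Bay", "Sha") / base      # Sha allele at Y only
    m_xy = class_mean("Sha", "Sha") / base     # Sha parental class
    return m_xy - m_x * m_y

toy = [
    {"X": "Bay", "Y": "Bay", "level": 10.0},
    {"X": "Sha", "Y": "Bay", "level": 12.0},
    {"X": "Bay", "Y": "Sha", "level": 15.0},
    {"X": "Sha", "Y": "Sha", "level": 30.0},
]
eps = epistasis_epsilon(toy, "X", "Y")
print(eps)
```

With these toy values the ratios are 1.2, 1.5, and 3.0, so ε is about 1.2; a double-Sha mean of 18 would give ratios that multiply exactly and an ε of essentially zero.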

To test the three main TCA cycle QTLs for a three-way epistatic interaction regulating the TCA cycle, we conducted an ANOVA using a model containing all 11 main-effect terms, all 55 pairwise interactions, and a single three-way epistatic term. An automated SAS script then individually tested the eight detected metabolites central to the TCA cycle against the combined model. The script returned all P values, type III sums-of-squares for the complete model and each individual term, and means for the eight genotypic classes in the three-way epistatic interaction. Significance values were corrected for multiple testing within a model using FDR < 0.05 in the automated script. Analyzing all other metabolites with the same model detected no additional metabolites showing this significant three-way interaction.

HIF195 Analysis

The Bay and Sha lines for HIF195 (http://dbsgap.versailles.inra.fr/vnat/) had been developed previously and utilized to validate a fructose QTL (http://dbsgap.versailles.inra.fr/portail/; Calenge et al., 2006). We obtained six independent maternal seed sets for each genotype to control for maternal effects on metabolite accumulation. The lines were grown and analyzed for metabolite accumulation as described above, in a randomized complete block design with two blocks and two plants per line. Metabolites were measured for 24 samples per genotype. We used ANOVA to test for differences between the Bay and Sha genotypes of HIF195. For each metabolite, the accumulation in genotype $g$ from line $l$ in block $j$ was denoted as $y_{glj}$. The model for each metabolite was as follows: $y_{glj} = \mu + G_g + G(L)_{glj} + \varepsilon_{glj}$, where $g$ = Bay or Sha, $l = 1, \ldots, 6$, and $j = 1, 2$. The main effects, $G$ and $L$, represented genotype and maternal line per genotype, respectively. The error, $\varepsilon_{glj}$, was assumed to be normally distributed, with mean 0 and variance $\sigma_{\varepsilon}^{2}$. An automated SAS script individually tested all metabolites for altered accumulation between the HIF195 genotypes. The script returned all P values, type III sums-of-squares for the complete model and each individual term, and means for the two HIF195 genotypes (see Supplemental Data Set 7 online).
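To make the structure of this nested model concrete, the sketch below partitions the total variation for one metabolite into genotype, maternal line within genotype, and residual components. It is a simplified stand-in for the SAS analysis, not a reproduction of it: the function name, the data layout, and the toy values are assumptions, and the sketch computes sequential sums of squares, which agree with type III sums of squares only when the design is balanced.

```python
from statistics import mean


def nested_sums_of_squares(data):
    """Partition variation for a genotype / line-within-genotype design.

    data maps genotype -> maternal line -> list of measurements.
    Returns (ss_genotype, ss_line_within_genotype, ss_error).
    """
    all_values = [y for lines in data.values()
                  for ys in lines.values() for y in ys]
    grand_mean = mean(all_values)
    ss_g = ss_l = ss_e = 0.0
    for lines in data.values():
        geno_values = [y for ys in lines.values() for y in ys]
        geno_mean = mean(geno_values)
        # Genotype effect: deviation of the genotype mean from the grand mean
        ss_g += len(geno_values) * (geno_mean - grand_mean) ** 2
        for ys in lines.values():
            line_mean = mean(ys)
            # Line effect, nested within genotype
            ss_l += len(ys) * (line_mean - geno_mean) ** 2
            # Residual variation among samples of the same line
            ss_e += sum((y - line_mean) ** 2 for y in ys)
    return ss_g, ss_l, ss_e


# Toy data (invented): two genotypes, two maternal lines each, two samples per line
toy = {
    "Bay": {"line1": [1.0, 2.0], "line2": [3.0, 4.0]},
    "Sha": {"line1": [5.0, 6.0], "line2": [7.0, 8.0]},
}
ss_g, ss_l, ss_e = nested_sums_of_squares(toy)
print(ss_g, ss_l, ss_e)
```

The three components sum to the total sum of squares around the grand mean, which gives a quick consistency check on any such partition.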

Supplemental Data

The following materials are available in the online version of this article.

  • Supplemental Figure 1. Metabolite Frequency.
  • Supplemental Figure 2. Positive and Negative Transgressive Segregation.
  • Supplemental Figure 3. Epistatic Metabolite QTL Networks.
  • Supplemental Figure 4. Genetic Architecture of Metabolomic QTLs across the Arabidopsis Chromosomes.
  • Supplemental Data Set 1. Metabolites Detected in This Research.
  • Supplemental Data Set 2. Parental Genotype Metabolite Analysis.
  • Supplemental Data Set 3. ANOVA for Metabolites Detected in Both Replicated Experiments.
  • Supplemental Data Set 4. ANOVA for Metabolites Detected in Only One Replicate of the RILs.
  • Supplemental Data Set 5. RIL Metabolite Values.
  • Supplemental Data Set 6. Metabolite QTL Information.
  • Supplemental Data Set 7. HIF195 Analysis.
  • Supplemental Data Set 8. Single-Marker QTL Analysis by ANOVA.
  • Supplemental Data Set 9. Epistatic ANOVA P Values.
  • Supplemental Data Set 10. Epistatic ANOVA Type III Sums-of-Squares.


We thank Ida Elken Sønderby and Adam Wentzell for reviewing the manuscript and Katherine Pollard for statistical insight. Funding for this project was obtained from National Science Foundation Grant DBI 0642481 to D.J.K. The Danish National Research Foundation is acknowledged for its support to the Center for Molecular Plant Physiology. B.G.H. acknowledges the Graduate School in Biotechnology for a PhD stipend.


The author responsible for distribution of materials integral to the findings presented in this article in accordance with the policy described in the Instructions for Authors (www.plantcell.org) is: Daniel J. Kliebenstein (kliebenstein@ucdavis.edu).

[W] Online version contains Web-only data.



  • Anderson, J.R., Schneider, J.R., Grimstad, P.R., and Severson, D.W. (2006). Identification of quantitative trait loci for larval morphological traits in interspecific hybrids of Ochlerotatus triseriatus and Ochlerotatus hendersoni (Diptera: Culicidae). Genetica 127 163–175. [PubMed]
  • Arita, M. (2004). The metabolic world of Escherichia coli is not small. Proc. Natl. Acad. Sci. USA 101 1543–1547. [PMC free article] [PubMed]
  • Basten, C.J., Weir, B.S., and Zeng, Z.-B. (1999). QTL Cartographer, Version 1.13. (Raleigh, NC: Department of Statistics, North Carolina State University).
  • Beavis, W.D. (1994). The power and deceit of QTL experiments: Lessons from comparative QTL studies. In Proceedings of the Forty-Ninth Annual Corn & Sorghum Industry Research Conference (Washington, DC: American Seed Trade Association), pp. 250–266.
  • Beavis, W.D. (1998). QTL analyses: Power, precision, and accuracy. In Molecular Dissection of Complex Traits, A.H. Paterson, ed (New York: CRC Press), pp. 145–162.
  • Benderoth, M., Textor, S., Windsor, A.J., Mitchell-Olds, T., Gershenzon, J., and Kroymann, J. (2006). Positive selection driving diversification in plant secondary metabolism. Proc. Natl. Acad. Sci. USA 103 9118–9123. [PMC free article] [PubMed]
  • Brem, R.B., and Kruglyak, L. (2005). The landscape of genetic complexity across 5,700 gene expression traits in yeast. Proc. Natl. Acad. Sci. USA 102 1572–1577. [PMC free article] [PubMed]
  • Brem, R.B., Yvert, G., Clinton, R., and Kruglyak, L. (2002). Genetic dissection of transcriptional regulation in budding yeast. Science 296 752–755. [PubMed]
  • Calenge, F., Saliba-Colombani, V., Mahieu, S., Loudet, O., Daniel-Vedele, F., and Krapp, A. (2006). Natural variation for carbohydrate content in Arabidopsis. Interaction with complex traits dissected by quantitative genetics. Plant Physiol. 141 1630–1643. [PMC free article] [PubMed]
  • Churchill, G.A., and Doerge, R.W. (1994). Empirical threshold values for quantitative trait mapping. Genetics 138 963–971. [PMC free article] [PubMed]
  • Clark, R.M., et al. (2007). Common sequence polymorphisms shaping genetic diversity in Arabidopsis thaliana. Science 317 338–342. [PubMed]
  • Doerge, R.W. (2002). Mapping and analysis of quantitative trait loci in experimental populations. Nat. Rev. Genet. 3 43–52. [PubMed]
  • Doerge, R.W., and Churchill, G.A. (1996). Permutation tests for multiple loci affecting a quantitative character. Genetics 142 285–294. [PMC free article] [PubMed]
  • Elena, S.F., and Lenski, R.E. (1997). Test of synergistic interactions among deleterious mutations in bacteria. Nature 390 395–398. [PubMed]
  • Falconer, D.S., and Mackay, T.F.C. (1996). Introduction to Quantitative Genetics. (Essex, UK: Longman, Harlow).
  • Fell, D.A., and Wagner, A. (2000). The small world of metabolism. Nat. Biotechnol. 18 1121–1122. [PubMed]
  • Fiehn, O. (2002). Metabolomics—The link between genotypes and phenotypes. Plant Mol. Biol. 48 155–171. [PubMed]
  • Fiehn, O., Wohlgemuth, G., and Scholz, M. (2005). Setup and annotation of metabolomic experiments by integrating biological and mass spectrometric metadata. Proc. Lect. Notes Bioinformatics 3615 224–239.
  • Hansen, B.G., Halkier, B.A., and Kliebenstein, D.J. (2008). Identifying the molecular basis of QTLs: eQTLs add a new dimension. Trends Plant Sci. 13 72–77. [PubMed]
  • Hansen, B.G., Kliebenstein, D.J., and Halkier, B.A. (2007). Identification of a flavin-monooxygenase as the S-oxygenating enzyme in aliphatic glucosinolate biosynthesis in Arabidopsis. Plant J. 50 902–910. [PubMed]
  • Hirai, M., et al. (2007). Omics-based identification of Arabidopsis Myb transcription factors regulating aliphatic glucosinolate biosynthesis. Proc. Natl. Acad. Sci. USA 104 6478–6483. [PMC free article] [PubMed]
  • Hirai, M.Y., et al. (2005). Elucidation of gene-to-gene and metabolite-to-gene networks in Arabidopsis by integration of metabolomics and transcriptomics. J. Biol. Chem. 280 25590–25595. [PubMed]
  • Hoffmann, A.A., and Weeks, A.R. (2007). Climatic selection on genes and traits after a 100 year-old invasion: A critical look at the temperate-tropical clines in Drosophila melanogaster from eastern Australia. Genetica 129 133–147. [PubMed]
  • Jansen, R.C., and Nap, J.P. (2001). Genetical genomics: The added value from segregation. Trends Genet. 17 388–391. [PubMed]
  • Jeong, H., Tombor, B., Albert, R., Oltvai, Z.N., and Barabasi, A.L. (2000). The large-scale organization of metabolic networks. Nature 407 651–654. [PubMed]
  • Keurentjes, J.J.B., Fu, J.Y., de Vos, C.H.R., Lommen, A., Hall, R.D., Bino, R.J., van der Plas, L.H.W., Jansen, R.C., Vreugdenhil, D., and Koornneef, M. (2006). The genetics of plant metabolism. Nat. Genet. 38 842–849. [PubMed]
  • Keurentjes, J.J.B., Fu, J.Y., Terpstra, I.R., Garcia, J.M., van den Ackerveken, G., Snoek, L.B., Peeters, A.J.M., Vreugdenhil, D., Koornneef, M., and Jansen, R.C. (2007). Regulatory network construction in Arabidopsis by using genome-wide gene expression quantitative trait loci. Proc. Natl. Acad. Sci. USA 104 1708–1713. [PMC free article] [PubMed]
  • Kim, M., Lim, J.H., Ahn, C.S., Park, K., Kim, G.T., Kim, W.T., and Pai, H.S. (2006). Mitochondria-associated hexokinases play a role in the control of programmed cell death in Nicotiana benthamiana. Plant Cell 18 2341–2355. [PMC free article] [PubMed]
  • Kliebenstein, D., Lambrix, V., Reichelt, M., Gershenzon, J., and Mitchell-Olds, T. (2001a). Gene duplication and the diversification of secondary metabolism: Side chain modification of glucosinolates in Arabidopsis thaliana. Plant Cell 13 681–693. [PMC free article] [PubMed]
  • Kliebenstein, D., West, M., van Leeuwen, H., Loudet, O., Doerge, R., and St. Clair, D. (2006). Identification of QTLs controlling gene expression networks defined a priori. BMC Bioinformatics 7 308. [PMC free article] [PubMed]
  • Kliebenstein, D.J. (2008). A role for gene duplication and natural variation of gene expression in the evolution of metabolism. PLoS One. 3 e1838. [PMC free article] [PubMed]
  • Kliebenstein, D.J., Figuth, A., and Mitchell-Olds, T. (2002a). Genetic architecture of plastic methyl jasmonate responses in Arabidopsis thaliana. Genetics 161 1685–1696. [PMC free article] [PubMed]
  • Kliebenstein, D.J., Gershenzon, J., and Mitchell-Olds, T. (2001b). Comparative quantitative trait loci mapping of aliphatic, indolic and benzylic glucosinolate production in Arabidopsis thaliana leaves and seeds. Genetics 159 359–370. [PMC free article] [PubMed]
  • Kliebenstein, D.J., Kroymann, J., Brown, P., Figuth, A., Pedersen, D., Gershenzon, J., and Mitchell-Olds, T. (2001c). Genetic control of natural variation in Arabidopsis thaliana glucosinolate accumulation. Plant Physiol. 126 811–825. [PMC free article] [PubMed]
  • Kliebenstein, D.J., Pedersen, D., and Mitchell-Olds, T. (2002b). Comparative analysis of insect resistance QTL and QTL controlling the myrosinase/glucosinolate system in Arabidopsis thaliana. Genetics 161 325–332. [PMC free article] [PubMed]
  • Kroymann, J., Donnerhacke, S., Schnabelrauch, D., and Mitchell-Olds, T. (2003). Evolutionary dynamics of an Arabidopsis insect resistance quantitative trait locus. Proc. Natl. Acad. Sci. USA 100 14587–14592. [PMC free article] [PubMed]
  • Lambrix, V., Reichelt, M., Mitchell-Olds, T., Kliebenstein, D., and Gershenzon, J. (2001). The Arabidopsis epithiospecifier protein promotes the hydrolysis of glucosinolates to nitriles and influences Trichoplusia ni herbivory. Plant Cell 13 2793–2807. [PMC free article] [PubMed]
  • Lexer, C., Rosenthal, D.M., Raymond, O., Donovan, L.A., and Rieseberg, L.H. (2005). Genetics of species differences in the wild annual sunflowers, Helianthus annuus and H. petiolaris. Genetics 169 2225–2239. [PMC free article] [PubMed]
  • Liu, B.-H. (1998). Statistical Genomics: Linkage, Mapping and QTL Analysis. (Boca Raton, FL: CRC Press).
  • Loudet, O., Chaillou, S., Camilleri, C., Bouchez, D., and Daniel-Vedele, F. (2002). Bay-0 × Shahdara recombinant inbred line population: A powerful tool for the genetic dissection of complex traits in Arabidopsis. Theor. Appl. Genet. 104 1173–1184. [PubMed]
  • Loudet, O., Chaillou, S., Merigout, P., Talbotec, J., and Daniel-Vedele, F. (2003). Quantitative trait loci analysis of nitrogen use efficiency in Arabidopsis. Plant Physiol. 131 345–358. [PMC free article] [PubMed]
  • Lynch, M., and Walsh, B. (1998). Genetics and Analysis of Quantitative Traits. (Sunderland, MA: Sinauer Associates).
  • Magrath, R., Bano, F., Morgner, M., Parkin, I., Sharpe, A., Lister, C., Dean, C., Turner, J., Ludiate, D., and Mithen, R. (1994). Genetics of aliphatic glucosinolates. I. Side chain elongation in Brassica napus and Arabidopsis thaliana. Heredity 72 290–299.
  • Magrath, R., Herron, C., Giamoustaris, A., and Mithen, R. (1993). The inheritance of aliphatic glucosinolates in Brassica napus. Plant Breed. 111 55–72.
  • McMullen, M.D., Byrne, P.F., Snook, M.E., Wiseman, B.R., Lee, E.A., Widstrom, N.W., and Coe, E.H. (1998). Quantitative trait loci and metabolic pathways. Proc. Natl. Acad. Sci. USA 95 1996–2000. [PMC free article] [PubMed]
  • Meyer, R.C., Steinfath, M., Lisec, J., Becher, M., Witucka-Wall, H., Törjék, O., Fiehn, O., Eckardt, A., Willmitzer, L., Selbig, J., and Altmann, T. (2007). The metabolic signature related to high plant growth rate in Arabidopsis thaliana. Proc. Natl. Acad. Sci. USA 104 4759–4764. [PMC free article] [PubMed]
  • Mithen, R., and Campos, H. (1996). Genetic variation of aliphatic glucosinolates in Arabidopsis thaliana and prospects for map based gene cloning. Entomol. Exp. Appl. 80 202–205.
  • Morley, M., Molony, C.M., Weber, T.M., Devlin, J.L., Ewens, K.G., Spielman, R.S., and Cheung, V.G. (2004). Genetic analysis of genome-wide variation in human gene expression. Nature 430 743–747. [PMC free article] [PubMed]
  • Mueller, L.A., Zhang, P.F., and Rhee, S.Y. (2003). AraCyc: A biochemical pathway database for Arabidopsis. Plant Physiol. 132 453–460. [PMC free article] [PubMed]
  • Nikiforova, V.J., Kopka, J., Tolstikov, V., Fiehn, O., Hopkins, L., Hawkesford, M.J., Hesse, H., and Hoefgen, R. (2005). Systems rebalancing of metabolism in response to sulfur deprivation, as revealed by metabolome analysis of Arabidopsis plants. Plant Physiol. 138 304–318. [PMC free article] [PubMed]
  • Nordborg, M., et al. (2005). The pattern of polymorphism in Arabidopsis thaliana. PLoS Biol. 3 e196. [PMC free article] [PubMed]
  • Pfalz, M., Vogel, H., Mitchell-Olds, T., and Kroymann, J. (2007). Mapping of QTL for resistance against the crucifer specialist herbivore Pieris brassicae in a new Arabidopsis inbred line population, Da(1)-12×Ei-2. PLoS One. 2 e578. [PMC free article] [PubMed]
  • Potokina, E., Druka, A., Luo, Z., Wise, R., Waugh, R., and Kearsey, M. (2008). Gene expression quantitative trait locus analysis of 16 000 barley genes reveals a complex pattern of genome-wide transcriptional regulation. Plant J. 53 90–101. [PubMed]
  • Rebai, A. (1997). Comparison of methods for regression interval mapping in QTL analysis with non-normal traits. Genet. Rev. 69 69–74.
  • Roessner, U., Luedemann, A., Brust, D., Fiehn, O., Linke, T., Willmitzer, L., and Fernie, A.R. (2001). Metabolic profiling allows comprehensive phenotyping of genetically or environmentally modified plant systems. Plant Cell 13 11–29. [PMC free article] [PubMed]
  • Sarrobert, C., Thibaud, M.C., Contard-David, P., Gineste, S., Bechtold, N., Robaglia, C., and Nussaume, L. (2000). Identification of an Arabidopsis thaliana mutant accumulating threonine resulting from mutation in a new dihydrodipicolinate synthase gene. Plant J. 24 357–367. [PubMed]
  • Sax, K. (1923). The association of size differences with seed-coat pattern and pigmentation in Phaseolus vulgaris. Genetics 8 552–560. [PMC free article] [PubMed]
  • Schadt, E.E., et al. (2003). Genetics of gene expression surveyed in maize, mouse and man. Nature 422 297–302. [PubMed]
  • Schauer, N., et al. (2006). Comprehensive metabolic profiling and phenotyping of interspecific introgression lines for tomato improvement. Nat. Biotechnol. 24 447–454. [PubMed]
  • Segre, D., DeLuna, A., Church, G.M., and Kishony, R. (2005). Modular epistasis in yeast metabolism. Nat. Genet. 37 77–83. [PubMed]
  • Sønderby, I.E., Hansen, B.G., Bjarnholt, N., Ticconi, C., Halkier, B.A., and Kliebenstein, D.J. (2007). A systems biology approach identifies a R2R3 MYB gene subfamily with distinct and overlapping functions in regulation of aliphatic glucosinolates. PLoS One. 2 e1322. [PMC free article] [PubMed]
  • Symonds, V.V., Godoy, A.V., Alconada, T., Botto, J.F., Juenger, T.E., Casal, J.J., and Lloyd, A.M. (2005). Mapping quantitative trait loci in multiple populations of Arabidopsis thaliana identifies natural allelic variation for trichome density. Genetics 169 1649–1658. [PMC free article] [PubMed]
  • Szalma, S.J., Buckler, E.S., Snook, M.E., and McMullen, M.D. (2005). Association analysis of candidate genes for maysin and chlorogenic acid accumulation in maize silks. Theor. Appl. Genet. 110 1324–1333. [PubMed]
  • Textor, S., Bartram, S., Kroymann, J., Falk, K.L., Hick, A., Pickett, J.A., and Gershenzon, J. (2004). Biosynthesis of methionine-derived glucosinolates in Arabidopsis thaliana: Recombinant expression and characterization of methylthioalkylmalate synthase, the condensing enzyme of the chain-elongation cycle. Planta 218 1026–1035. [PubMed]
  • Textor, S., de Kraker, J.W., Hause, B., Gershenzon, J., and Tokuhisa, J.G. (2007). MAM3 catalyzes the formation of all aliphatic glucosinolate chain lengths in Arabidopsis. Plant Physiol. 144 60–71. [PMC free article] [PubMed]
  • Uknes, S., Mauch-Mani, B., Moyer, M., Potter, S., Williams, S., Dincher, S., Chandler, D., Slusarenko, A., Ward, E., and Ryals, J. (1992). Acquired resistance in Arabidopsis. Plant Cell 4 645–656. [PMC free article] [PubMed]
  • Van Leeuwen, H., Kliebenstein, D.J., West, M.A., Kim, K.D., van Poecke, R., Katagiri, F., Michelmore, R.W., Doerge, R.W., and St. Clair, D.A. (2007). Natural variation among Arabidopsis thaliana accessions for transcriptome response to exogenous salicylic acid. Plant Cell 19 2099–2110. [PMC free article] [PubMed]
  • Wang, S., Basten, C.J., and Zeng, Z.-B. (2006). Windows QTL Cartographer 2.5. (Raleigh, NC: Department of Statistics, North Carolina State University).
  • Weckwerth, W., Loureiro, M.E., Wenzel, K., and Fiehn, O. (2004a). Differential metabolic networks unravel the effects of silent plant phenotypes. Proc. Natl. Acad. Sci. USA 101 7809–7814. [PMC free article] [PubMed]
  • Weckwerth, W., Wenzel, K., and Fiehn, O. (2004b). Process for the integrated extraction, identification, and quantification of metabolites, proteins and RNA to reveal their co-regulation in biochemical networks. Proteomics 4 78–83. [PubMed]
  • Wentzell, A.M., Rowe, H.C., Hansen, B.G., Ticconi, C., Halkier, B.A., and Kliebenstein, D.J. (2007). Linking metabolic QTL with network and cis-eQTL controlling biosynthetic pathways. PLoS Genet. 3 e162. [PMC free article] [PubMed]
  • West, M.A.L., Kim, K., Kliebenstein, D.J., van Leeuwen, H., Michelmore, R.W., Doerge, R.W., and St. Clair, D.A. (2007). Global eQTL mapping reveals the complex genetic architecture of transcript level variation in Arabidopsis. Genetics 175 1441–1450. [PMC free article] [PubMed]
  • West, M.A.L., van Leeuwen, H., Kozik, A., Kliebenstein, D.J., Doerge, R.W., St. Clair, D.A., and Michelmore, R.W. (2006). High-density haplotyping with microarray-based expression and single feature polymorphism markers in Arabidopsis. Genome Res. 16 787–795. [PMC free article] [PubMed]
  • Yagil, C., Barkalifa, R., Sapojnikov, M., Wechsler, A., Ben-Dor, D., Weksler-Zangen, S., Kaiser, N., Raz, I., and Yagil, Y. (2007). Metabolic and genomic dissection of diabetes in the Cohen rat. Physiol Genomics 29 181–192. [PubMed]
  • Zeng, Z.-B., Kao, C.-H., and Basten, C.J. (1999). Estimating the genetic architecture of quantitative traits. Genet. Res. 75 345–355.
  • Zhang, H.K., Zhang, X., Li, Q., and He, Z.H. (2004). Alpha-picolinic acid activates diverse defense responses of salicylic acid-, jasmonic acid/ethylene- and Ca2+-dependent pathways in Arabidopsis and rice suspension cells. Acta Bot. Sin. 46 1200–1205.
  • Zhang, P.F., Foerster, H., Tissier, C.P., Mueller, L., Paley, S., Karp, P.D., and Rhee, S.Y. (2005). MetaCyc and AraCyc. Metabolic pathway databases for plant research. Plant Physiol. 138 27–37. [PMC free article] [PubMed]
  • Zhang, Z.-Y., Ober, J.A., and Kliebenstein, D.J. (2006). The gene controlling the quantitative trait locus EPITHIOSPECIFIER MODIFIER1 alters glucosinolate hydrolysis and insect resistance in Arabidopsis. Plant Cell 18 1524–1536. [PMC free article] [PubMed]

Articles from The Plant Cell are provided here courtesy of American Society of Plant Biologists