Copyright © 2005, The National Academy of Sciences

Plant Biology

Hierarchical metabolomics demonstrates substantial compositional similarity between genetically modified and conventional potato crops

*Max Planck Institute for Molecular Plant Physiology, D-14424 Golm, Germany; and ‡Institute of Biological Sciences and §Department of Computer Science, University of Wales, Aberystwyth SY23 3DA, United Kingdom

** To whom correspondence should be addressed. E-mail: jhd/at/aber.ac.uk.

†G.S.C., M.B., and O.F. contributed equally to this work.

¶Present address: Institute of Grassland and Environmental Research, Plas Gogerddan, Aberystwyth SY23 3EB, United Kingdom.

Present address: Department of Chemistry, Faraday Building, University of Manchester, Manchester M60 1QD, United Kingdom.

Edited by Marc C. E. Van Montagu, Ghent University, Ghent, Belgium, and approved August 12, 2005

Received May 12, 2005.

Abstract

There is current debate whether genetically modified (GM) plants might contain unexpected, potentially undesirable changes in overall metabolite composition. However, appropriate analytical technology and acceptable metrics of compositional similarity require development. We describe a comprehensive comparison of total metabolites in field-grown GM and conventional potato tubers using a hierarchical approach initiating with rapid metabolome “fingerprinting” to guide more detailed profiling of metabolites where significant differences are suspected. Central to this strategy are data analysis procedures able to generate validated, reproducible metrics of comparison from complex metabolome data. We show that, apart from targeted changes, the GM potatoes in this study appear substantially equivalent to traditional cultivars.
Keywords: genetically modified, substantial equivalence, machine learning

There is concern that genetic engineering may allow introduction of unforeseen traits into crops, causing them to contain undesirable metabolites (1, 2). “Substantial equivalence” is used as the starting point to structure current food safety assessment and suggests comparison of intended differences between the genetically modified (GM) plant and progenitor cultivar (1, 2). We compared field-grown tubers from conventional potato cultivars and genotypes bioengineered to contain high levels of inulin-type fructans (3, 4). Inulins stimulate bifidobacteria growth in the intestine and help to boost digestive tract pathogen resistance (5). The beneficial effects of inulins as prebiotic food supplements have been well publicized; thus, this metabolic pathway provides a readily understandable scientific context.

Two classes of experimental transgenic lines developed in the cultivar Désirée were investigated. The first transgene coded for the enzyme sucrose:sucrose 1-fructosyltransferase (SST), which transfers a fructosyl residue from one sucrose molecule to another, producing the trisaccharide 1-kestose, and oligofructans up to 5 degrees of polymerization (DP) (3, 4). The second transgene was fructan:fructan 1-fructosyltransferase (FFT), the product of which utilizes 1-kestose (and other oligofructans) to build inulin polymers (3, 4).

In any compositional comparison it is important to develop robust metabolomics methodology allowing for, as near as possible, a global analysis of metabolite content (6-8). Established methods for metabolite analysis include gas chromatography, HPLC, or capillary electrophoresis, usually linked to mass spectrometers (9-11).
Such approaches result in detailed knowledge relating to only a subset of previously characterized metabolites (6-11), and studies thus far have been restricted to single, relatively small batches of plants produced under controlled growth conditions (9, 12-14). For an initial screen of overall compositional similarity, we propose more rapid and less selective fingerprinting techniques that do not incorporate a chromatographic step (8, 15-18). Fingerprints based on MS, such as flow injection electrospray ionization (FIE)-MS, can be regarded as simplified images of total sample composition in that the measured variables (m/z) are compiled by integrating the levels of more than one metabolite (e.g., for isomers). Where compositional differences unrelated to the bioengineered trait are suggested, substantial equivalence testing can be applied to more detailed metabolome analysis involving a chromatographic step guided by the fingerprinting results.

Defining substantial equivalence does not fall neatly into a standard statistical task. Unsupervised data analysis techniques, such as principal components analysis (PCA) (19), look for regularities in unlabeled data. Supervised techniques, such as linear discriminant analysis (LDA) (19, 20) and decision tree analysis (21), build models that discriminate between labeled data (22, 23). However, for substantial equivalence we are interested in data similarity rather than the ability to discriminate classes. We reason that if an unsupervised algorithm clusters metabolome samples close together, then they can be objectively considered to be similar, and if classes cannot easily be discriminated by supervised methods then they are objectively similar.

The overall experimental approach was to initially evaluate the degree of compositional similarity between tubers of individual traditional potato cultivars.
This comparison provided a context for determining whether transgenic potatoes displayed alterations in metabolite composition outside the range exhibited normally by conventional cultivars. To ensure comprehensive coverage of the metabolome, a hierarchical approach was adopted that initially involved a nonselective metabolite fingerprinting technique followed by more detailed global profiling of individual metabolites and finally a targeted analysis of any metabolites responsible for discriminating GM genotypes. Data-mining methods were used that were specifically capable of identifying metabolites responsible for differences between potato genotypes. The use of several different data analysis methods ensured that any conclusions relating to metrics of similarity were independent of specific statistical treatments.

Materials and Methods

Plant Material. The experimental transgenic genotypes derived from the progenitor cultivar Désirée are described in ref. 3. The GM plants were grown under field conditions in a block design for the 2001 and 2003 growing seasons together with the conventional cultivars Agria, Linda, Granola, Solara, and two Désirée lines [one line was propagated through tissue culture (De2), and the other was obtained from tuber propagation]. Approximately 48 tubers were selected at random from each of four randomly arranged field blocks and stored at 4°C for 4 weeks before sample preparation. Potato tuber disks (fresh weight, 200 mg each) were excised from 3 mm below the tuber peel, perpendicular to the main tuber axis. Immediately after cutting, disks were frozen in liquid nitrogen and kept frozen at -80°C before extraction.

Sample Preparation and Metabolite Analysis. Tuber slice homogenization and extraction in 1 ml of prechilled water/methanol/chloroform (2:5:2, vol/vol/vol) and GC TOF-MS analysis were carried out as described in ref. 24. FIE-MS was performed with an LCT mass spectrometer (Micromass, Manchester, U.K.)
as described in detail in Supporting Materials and Methods, which is published as supporting information on the PNAS web site. Randomized extracts were diluted 1:50 in water/methanol (60:40, vol/vol), and aliquots of 40 μl were injected into a flow of 100 μl·min⁻¹ water/methanol (60:40, vol/vol) with a Waters Alliance 2690 liquid chromatography (LC) system. LC-MS-targeted analysis for glycoalkaloids and oligofructans was performed with an LCQ Quantum triple quadrupole mass spectrometer (ThermoFinnigan, San Jose, CA) running Xcalibur software (version 1.3, ThermoFinnigan) as described in ref. 25. Confirmation of 1-kestose presence besides raffinose in the Solara, Linda, SST, and SST/FFT lines was performed by hydrophilic-interaction LC (10) and triple-quadrupole MS in MRM mode on fragmentations of parent ion m/z 522 to m/z 325, 163, 145, 127, and 85. Chromatograms were processed with LCquan (Xcalibur, version 1.3).

Data Analysis. FIE-MS raw data were first log-transformed and then normalized to the total ion current before analysis. All GC-TOF data were normalized to total peak area and then log-transformed. The latter data matrix contained 15.4% missing values, being either below detection limit (true low values) or missed because of failures of the automatic deconvolution and peak detection software (missing values). The 1-kestose (expected new metabolite in GM lines) region in all 2,253 chromatograms was manually checked and corrected because this molecule was found to have a retention time very close to that of raffinose. Undetected peaks were excluded in the univariate analysis. Boundaries delimiting the relative concentration range of each metabolite observed by GC-TOF in the conventional cultivars were first determined, and the level of each metabolite in GM lines was then compared to the specific limits set for it.
From frequency distributions of metabolites in cultivars regarded as “safe,” upper and lower limits of commonly detected relative metabolite levels were calculated. One-sigma deviations from cultivar mean levels were regarded as a conservative borderline of typically found food metabolite levels. For each metabolite, one standard deviation from each comparator group mean was calculated, and the overall maximum and minimum were taken as conservative estimations of the extents of acceptability. As a further test, it was determined also whether the mean of an individual GM line differed significantly from the mean of each of the cultivars. Nonparametric multiple comparisons corrected for unequal sample sizes with tied ranks (described in ref. 26) were performed with the R environment (27) and the results presented as Q values.

For multivariate analysis, each initial data matrix was split randomly into a training set and a test set (two-thirds and one-third, respectively). This method of division allows a direct comparison of the accuracies of any model using McNemar's test (28, 29). Some multivariate methods (e.g., PCA) require complete data matrices (19, 30, 31); therefore, when required, the overall mean of the peak intensity taken from the training set was applied to in-fill the missing values in the training and validation sets. PCA (30) was carried out by using MATLAB (version 6.5, release 13; Mathworks, Natick, MA) on the mean-centered covariance matrix of the training set. The training set only was used to build PCA models. LDA (19) [also referred to in chemometrics literature (17) as discriminant function analysis] was implemented in MATLAB according to the procedure described in ref. 17. Decision tree analysis was carried out on the original data matrix (without in-filling) and on the mean in-filled data matrix using an implementation of the C4.5 algorithm (21) in the rpart package in R (27).
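The univariate acceptability limits and the training-set in-filling described above can be sketched roughly as follows. This is a minimal illustration in Python, not the authors' R or MATLAB code: the function names, the toy numbers, and the reading of "overall mean of the peak intensity" as a per-peak mean over the training set are all assumptions made for the example.

```python
from statistics import mean, stdev


def acceptability_limits(levels_by_cultivar):
    """Limits for ONE metabolite from conventional cultivars.

    For each cultivar take (mean - 1 SD, mean + 1 SD); the overall
    minimum and maximum across cultivars give the acceptable range.
    """
    lows, highs = [], []
    for levels in levels_by_cultivar.values():
        centre = mean(levels)
        spread = stdev(levels)  # sample standard deviation (n - 1)
        lows.append(centre - spread)
        highs.append(centre + spread)
    return min(lows), max(highs)


def within_limits(value, limits):
    """True if a GM-line level lies inside the cultivar-derived range."""
    low, high = limits
    return low <= value <= high


def infill_with_training_mean(train, test):
    """Replace missing values (None) using training-set means only.

    Assumption: one mean per peak (column), computed from the observed
    training values. A peak with no observed training value would raise
    statistics.StatisticsError.
    """
    n_peaks = len(train[0])
    fills = []
    for j in range(n_peaks):
        observed = [row[j] for row in train if row[j] is not None]
        fills.append(mean(observed))

    def fill(rows):
        return [[fills[j] if v is None else v for j, v in enumerate(row)]
                for row in rows]

    return fill(train), fill(test)


# Made-up relative levels of a single metabolite (illustration only).
cultivars = {"Agria": [1.0, 2.0, 3.0], "Linda": [2.0, 4.0, 6.0]}
limits = acceptability_limits(cultivars)
gm_line_mean = mean([4.5, 5.0, 5.5])
print(limits, within_limits(gm_line_mean, limits))

# Made-up two-peak matrices with missing entries.
train = [[1.0, None], [3.0, 4.0]]
test = [[None, None]]
train_filled, test_filled = infill_with_training_mean(train, test)
print(train_filled, test_filled)
```

Computing the fill values from the training rows alone keeps information from the test set out of any model built afterwards, which matches the statement that only the training set was used to build PCA models.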
The results of the analysis on the original data are presented, but broadly similar overall classification accuracies were achieved by using both data sets.

Results

Potato Genotypes Have Distinct Metabolomes. FIE-MS fingerprints were generated for 600 samples representing all genotypes selected randomly from four field plots. PCA showed that metabolome variation was dominated by the three major genotype metaclasses (cultivars, SST, and SST/FFT) (Fig. 1A
The Most Discriminatory Ions Are Derived from Fructans. The GM lines had been engineered to synthesize novel metabolites; therefore, the virtually complete separation of GM and non-GM lines in PCA space was not unexpected. Investigation of the relative contribution (loadings) of individual variables in the PC1 dimension highlighted 15 ions with a significant impact on genotype separation (Fig. 2A
Only Anticipated Metabolites Were Found in GM Lines. Analysis was extended in scope and depth in the next layer of data acquisition and testing for which 2,182 tubers were analyzed from the 12 genotypes, again randomized over all field plots. GC-TOF-MS (24) recorded 252 metabolite peaks in an automated manner (90 positively identified, 89 assigned to a specific metabolite class, and 73 classified as unknowns). The chromatographic region associated with the retention time of major disaccharides and trisaccharides of several chromatograms representative of the major genotype groups is shown in Fig. 3B
Analysis of GC-TOF data by PCA, LDA, and decision trees revealed a similar pattern of genotype clustering/discrimination to that observed in the fingerprinting analysis (Fig. 4C

Glycoalkaloid Levels Are Normal in GM Potatoes. We have concluded from a metabolomics study incorporating a range of different data analysis techniques that only six important fructosyl peaks resulted from the genetic modifications in potato. Despite this finding of only minor changes in oligosaccharide metabolism, the possibility of changes in potentially toxic, low-level secondary metabolites could not be excluded a priori. Further targeted analysis (Fig. 9, which is published as supporting information on the PNAS web site) revealed no changes in the levels of glycosidic steroidal alkaloids (α-chaconine and α-solanine), which usually comprise up to 95% of the total glycoalkaloid content of tubers from domesticated Solanum tuberosum cultivars (32).

Discussion

The nature of food in terms of safety cannot be assessed in an absolute manner. As a first pass in any compositional comparison, we suggest that a rapid but sensitive, comprehensive, and comparatively inexpensive first screen can be provided by mass spectrometric fingerprinting, which may be complemented by more detailed analyses using GC-TOF or LC-MS, depending on the level of similarity to other cultivars as determined by statistical analysis. A major finding from the present study was the large variation in metabolite profile between the conventional cultivars. These significant differences were never sought as desired traits in traditional breeding programs, and overall composition has not given cause for public safety concerns in conventionally bred cultivars.
In the context of substantial equivalence, we show that the metabolite composition of field-grown inulin-producing potatoes was within the natural metabolite range of classical cultivars and was, in fact, very similar to that of the progenitor line Désirée, with the exception of the introduced genes and, therefore, the predictable up-regulation of fructans and their expected derivatives. In the comparative assessment framework, such metabolic side products might eventually be subjected to more detailed investigations if deemed necessary with respect to toxicity, abundance, and chemical structure. The cultivar-based compositional heterogeneity we describe emphasizes the importance of comparison with a range of equivalent cultivars and not solely the parental line. For example, although 1-kestose was not found in the genetic background line of the GM plants, Désirée, a trisaccharide indistinguishable from 1-kestose was found in Solara and Linda tubers (see Fig. 3B

Supporting Information
Acknowledgments

We thank Bernd Hommel, Pia Roppel, and colleagues for designing and undertaking the field trials under Bundesanstalt für Land und Forstwirtschaft Project 0312632; Karin Koehl for study design; Arnd Heyer (Max Planck Institute for Molecular Plant Physiology) for generation and supply of transgenic material and helpful discussion; André van Laere and Wim van den Ende (Katholieke University, Leuven, Belgium) and Jerry Chatterton and Phil Harrison (Utah State University, Logan) for providing 2- to 4-DP fructan reference compounds; Jim Heald and Robert Darby for supporting LCT analysis; and Roy Goodacre and David Broadhurst for advice on data analysis. The metabolite analysis and statistical work were funded by the Food Standards Agency (London) as part of its G02006 project.

Notes

Author contributions: N.H., A.S., R.D.K., D.B.K., O.F., and J.D. designed research; G.S.C., M.B., M.M., B.Z., and O.F. performed research; G.S.C., M.B., D.P.E., J.T., N.H., R.D.K., and J.D. analyzed data; G.S.C., M.B., D.P.E., R.D.K., O.F., and J.D. wrote the paper; and J.D. coordinated the project consortium.

This paper was submitted directly (Track II) to the PNAS office.

Abbreviations: GM, genetically modified; SST, sucrose:sucrose 1-fructosyltransferase; FFT, fructan:fructan 1-fructosyltransferase; DP, degree(s) of polymerization; PCA, principal components analysis; FIE, flow injection electrospray ionization; LDA, linear discriminant analysis; DF, discriminant functions.

References

1. Organisation for Economic Cooperation and Development (2001) Report of the OECD Workshop on the Nutritional Assessment of Novel Foods and Feeds (Org. Econ. Cooperation Dev., Ottawa).
2. Kok, E. J. & Kuiper, H. A. (2003) Trends Biotechnol. 21, 439-444.
3. Hellwege, E. M., Czapla, S., Jahnke, A., Willmitzer, L. & Heyer, A. G. (2000) Proc. Natl. Acad. Sci. USA 97, 8699-8704.
4. Edelman, J. & Jefford, T. G. (1968) New Phytol. 67, 517-531.
5. Gibson, G. R., Beatty, E. R., Wang, X. & Cummings, J. H. (1995) Gastroenterology 108, 968-975.
6. Fiehn, O. (2002) Plant Mol. Biol. 48, 155-171.
7. Sumner, L. W., Mendes, P. & Dixon, R. A. (2003) Phytochemistry 62, 817-836.
8. Goodacre, R., Vaidyanathan, S., Dunn, W. B., Harrigan, G. G. & Kell, D. B. (2004) Trends Biotechnol. 22, 439-444.
9. Roessner, U., Wagner, C., Kopka, J., Trethewey, R. N. & Willmitzer, L. (2000) Plant J. 23, 131-142.
10. Tolstikov, V. V. & Fiehn, O. (2003) Anal. Biochem. 301, 298-307.
11. Sato, S., Soga, T., Nishioka, T. & Tomita, M. (2004) Plant J. 40, 151-163.
12. Taylor, J., King, R. D., Altmann, T. & Fiehn, O. (2002) Bioinformatics 18, S241-S248.
13. Fiehn, O., Kopka, J., Altmann, T., Trethewey, R. & Willmitzer, L. (2000) Nat. Biotechnol. 18, 1157-1161.
14. Roessner, U., Willmitzer, L. & Fernie, A. R. (2001) Plant Physiol. 127, 749-764.
15. Ward, J. L., Harris, C., Lewis, J. & Beale, M. H. (2003) Phytochemistry 62, 949-957.
16. Aharoni, A., De Vos, C. H. R., Verhoeven, H. A., Maliepaard, C. A., Kruppa, G., Bino, R. & Goodenowe, D. B. (2002) OMICS 6, 217-234.
17. Allen, J., Davey, H. M., Broadhurst, D., Heald, J. K., Rowland, J. J., Oliver, S. G. & Kell, D. B. (2003) Nat. Biotechnol. 21, 692-696.
18. Scholz, M., Gatzek, S., Sterling, A., Fiehn, O. & Selbig, J. (2004) Bioinformatics 20, 2447-2454.
19. Manley, B. F. J. (1994) Multivariate Statistical Methods: A Primer (Chapman & Hall, London).
20. Goodacre, R., Timmins, E. M., Burton, R., Kaderbhai, N., Woodward, A. M., Kell, D. B. & Rooney, P. J. (1998) Microbiology 144, 1157-1170.
21. Quinlan, J. R. (1993) C4.5: Programs for Machine Learning (Morgan Kaufmann, San Mateo, CA).
22. Wu, B., Abbott, T., Fishman, D., McMurray, W., Mor, G., Stone, K., Ward, D., Williams, K. & Zhao, H. (2004) Bioinformatics 19, 1636-1643.
23. Kell, D. B., Darby, R. M. & Draper, J. (2001) Plant Physiol. 126, 943-951.
24. Weckwerth, W., Loureiro, M. E., Wenzel, K. & Fiehn, O. (2004) Proc. Natl. Acad. Sci. USA 101, 7809-7814.
25. Zywicki, B., Catchpole, G., Draper, J. & Fiehn, O. (2004) Anal. Biochem. 336, 178-186.
26. Zar, J. H. (1984) in Biostatistical Analysis (Prentice-Hall, Englewood Cliffs, NJ), 2nd Ed, p. 201.
27. Gentleman, R. & Ihaka, R. (2004) R: A Language and Environment for Statistical Computing and Graphics (Univ. Auckland, Auckland, New Zealand).
28. McNemar, Q. (1947) Psychometrika 12, 153-157.
29. Dietterich, T. G. (1998) Neural Comput. 10, 1895-1923.
30. Jolliffe, I. T. (1986) Principal Component Analysis (Springer, New York).
31. Little, R. J. A. & Rubin, D. B. (1987) Statistical Analysis with Missing Data (Wiley, New York).
32. Van Gelder, W. M. J. (1991) in Poisonous Plant Contamination of Edible Plant, ed. Abdel-Fattah, M. (CRC Press, Boca Raton, FL).