Statistical total correlation spectroscopy scaling for enhancement of metabolic information recovery in biological NMR spectra

Anal Chem. 2012 Jan 17;84(2):1083-91. doi: 10.1021/ac202720f. Epub 2011 Dec 27.

Abstract

The high level of complexity in nuclear magnetic resonance (NMR) metabolic spectroscopic data sets has fueled the development of experimental and mathematical techniques that enhance latent biomarker recovery and improve model interpretability. We previously showed that statistical total correlation spectroscopy (STOCSY) can be used to edit NMR spectra to remove drug metabolite signatures that obscure metabolic variation of diagnostic interest. Here, we extend this "STOCSY editing" concept to a generalized scaling procedure for NMR data that enhances recovery of latent biochemical information and improves biological classification and interpretation. We call this new procedure STOCSY-scaling (STOCSY(S)). STOCSY(S) exploits the fixed proportionality, across a set of NMR spectra, between resonances from the same molecule to suppress or enhance features correlated with a resonance of interest. We demonstrate this new approach using two exemplar data sets: (a) a streptozotocin rat model (n = 30) of type 1 diabetes and (b) a human epidemiological study utilizing plasma NMR spectra of patients with metabolic syndrome (n = 67). In both cases, a significant improvement in biomarker discovery was observed with STOCSY(S). In the streptozotocin study, the approach successfully suppressed interfering NMR signals from glucose and lactate that otherwise dominate the variation, which allowed recovery of biomarkers such as glycine that were otherwise obscured. In the metabolic syndrome study, we used STOCSY(S) to enhance variation from the high-density lipoprotein cholesterol peak, improving the discrimination of individuals with metabolic syndrome from controls in orthogonal projections to latent structures discriminant analysis models and facilitating the biological interpretation of the results. Thus, STOCSY(S) is a versatile technique that is applicable in any situation in which variation, either biological or otherwise, dominates a data set at the expense of more interesting or important features. This approach is generally appropriate for many types of NMR-based complex mixture analyses and hence for wider applications in bioanalytical science.
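
As a rough illustration of the idea described in the abstract, the sketch below correlates every spectral variable with a chosen "driver" resonance across samples and then down-weights or up-weights variables according to that correlation. It is not the published STOCSY(S) formula, which the abstract does not give. The weighting functions (1 - |r|^power for suppression, 1 + |r|^power for enhancement), the function names, the `power` parameter, and the toy data are all assumptions made for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def stocsy_correlations(spectra, driver):
    """Correlate each spectral variable (column) with the driver column.

    spectra: list of samples, each a list of intensities on a common ppm grid.
    driver:  column index of the resonance of interest.
    """
    driver_col = [row[driver] for row in spectra]
    n_vars = len(spectra[0])
    return [pearson(driver_col, [row[j] for row in spectra]) for j in range(n_vars)]


def stocsy_scale(spectra, driver, mode="suppress", power=2):
    """Weight each variable by its correlation with the driver (hypothetical weighting).

    mode="suppress": weight = 1 - |r|**power, shrinking driver-correlated signals.
    mode="enhance":  weight = 1 + |r|**power, amplifying driver-correlated signals.
    """
    r = stocsy_correlations(spectra, driver)
    if mode == "suppress":
        weights = [1 - abs(c) ** power for c in r]
    elif mode == "enhance":
        weights = [1 + abs(c) ** power for c in r]
    else:
        raise ValueError("mode must be 'suppress' or 'enhance'")
    return [[v * w for v, w in zip(row, weights)] for row in spectra]


# Toy data (made up): columns 0 and 1 behave like two peaks of the same
# molecule (fixed 1:2 proportionality); column 2 varies independently.
toy_spectra = [
    [1.0, 2.0, 5.0],
    [2.0, 4.0, 3.0],
    [3.0, 6.0, 3.0],
    [4.0, 8.0, 5.0],
]
suppressed = stocsy_scale(toy_spectra, driver=0, mode="suppress")
```

In this toy case the two proportional columns are driven towards zero while the independent column is left unchanged, which mirrors the suppression of dominant glucose and lactate signals described for the streptozotocin study. Real spectra would need a softer or thresholded weighting, since chance correlations with the driver are never exactly zero.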

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Biomarkers / analysis*
  • Case-Control Studies
  • Cohort Studies
  • Diabetes Mellitus, Experimental / blood*
  • Discriminant Analysis*
  • Epidemiologic Studies
  • Humans
  • Male
  • Metabolic Syndrome / blood*
  • Metabolic Syndrome / epidemiology
  • Metabolome*
  • Nuclear Magnetic Resonance, Biomolecular*
  • Rats
  • Rats, Sprague-Dawley

Substances

  • Biomarkers