Comparison of seven modelling algorithms for γ‐aminobutyric acid–edited proton magnetic resonance spectroscopy

Edited MRS sequences are widely used for studying γ‐aminobutyric acid (GABA) in the human brain. Several algorithms are available for modelling these data, deriving metabolite concentration estimates through peak fitting or a linear combination of basis spectra. The present study compares seven such algorithms, using data obtained in a large multisite study. GABA‐edited (GABA+, TE = 68 ms MEGA‐PRESS) data from 222 subjects at 20 sites were processed via a standardised pipeline, before modelling with FSL‐MRS, Gannet, AMARES, QUEST, LCModel, Osprey and Tarquin, using standardised vendor‐specific basis sets (for GE, Philips and Siemens) where appropriate. After referencing metabolite estimates (to water or creatine), systematic differences in scale were observed between datasets acquired on different vendors' hardware, presenting across algorithms. Scale differences across algorithms were also observed. Using the correlation between metabolite estimates and voxel tissue fraction as a benchmark, most algorithms were found to be similarly effective in detecting differences in GABA+. An interclass correlation across all algorithms showed single‐rater consistency for GABA+ estimates of around 0.38, indicating moderate agreement. Upon inclusion of a basis set component explicitly modelling the macromolecule signal underlying the observed 3.0 ppm GABA peaks, single‐rater consistency improved to 0.44. Correlation between discrete pairs of algorithms varied, and was concerningly weak in some cases. Our findings highlight the need for consensus on appropriate modelling parameters across different algorithms, and for detailed reporting of the parameters adopted in individual studies to ensure reproducibility and meaningful comparison of outcomes between different studies.


B.2 Discussion
Whilst a flexible baseline can provide a visually appealing model, i.e., reduce apparent residuals, this does not necessarily lead to more accurate metabolite estimates. This is particularly apparent for the case of Osprey (without the MM3co component), where residuals associated with macromolecule contamination and the 3.2 ppm artefact pull the baseline far into the peak at 3 ppm, potentially leading to a possible under-estimation of the corresponding GABA+ signal. In the present study, stiffer baseline models appeared to be more robust towards small artefacts, allowing the model to reasonably follow the Gaussian contour of the edited GABA+ signal without absorbing too much of it into the baseline.
Nonetheless, using correlation with grey matter fraction as a benchmark, the performance of each algorithm without an explicit MM3 basis component, allowing a stiff but non-zero baseline to absorb the macromolecule signal, was comparable with more elaborate constrained configurations for assessment of GABA+. Moreover, when reporting GABA separately, such a baseline model showed a tendency towards better performance (for LCModel) than other approaches. All configurations except LCModel 0.6 BL 1:1 GABA were able to assess GABA/H2O separately with comparable effectiveness to GABA+/H2O from the same algorithm. However, we note that elevated MM3 content in white matter 2,3 may moderate correlation in the case of GABA+. Divergent outcomes for separate GABA/H2O and MM3co/H2O estimates across different vendors may be of concern, and requires further attention.

C Difference spectrum modelling
Further details on specific parameters adopted for modelling the difference spectrum by each algorithm are presented here; this supplements details in the main article, section 2.3.
Basis set construction is described in the main paper, section 2.3.1. Basis sets were exported both in their original resolution (8192 samples at 4 kHz sweep width) and after resampling to match the resolution of the acquired data for each manufacturer/site, to cater for those algorithms which were unable to resample effectively internally (QUEST, Tarquin).
The LCModel basis set format was used as an intermediary for import into LCModel, Tarquin and FSL-MRS, while individual jMRUI-format text files were required for assembly into a jMRUI-compatible metabolite list for QUEST. Typical invocation of FSL-MRS for GABA-edited difference spectra was with the following parameters:

C.1 FSL-MRS
Defaults: GannetFit was run on data from the standardised processing pipeline (reported as "Gannet"), and on data processed with Gannet's own default pipeline, denoted "Gannet (native)". The latter applies zero-filling (to 0.062 Hz spectral resolution), and line-broadening In earlier testing with the supplied NMRScope-B tool and the supplied megapress_2edit_voi_1pws protocol (configured for TE = 68 ms, TE1 = 13.1 ms, water suppression at 4.7 ppm and 15 ms, 180º gaussian editing pulses at 1.9 and 7.5 ppm adjusted to an "observation offset" of 2 ppm), both the NAAG and GSH components consistently failed to simulate: a complete basis set could not be produced with this tool alone.
The assembled metabolite lists were subsequently used to model batches of data grouped by manufacturer and spectral resolution. The QUEST algorithm was invoked with the zero-order phase fixed to 0 degrees. Other parameters were according to default: no first-order phase; damping factor a (proportional to linewidth, DFWHM = a/p) and frequency shift, per basis function, set on the range -10 to 40 Hz and -10 to 10 Hz, respectively; no baseline terms were included.

C.5 LCModel
LCModel 11 models data with a linear combination of basis sets. Version 6. 3-1P  LCModel was invoked with the following parameters for modelling GABA-edited difference spectra: • SPTYPE='mega-press-3' sets the "special type" appropriately for MEGA- Processed data and basis sets were supplied to Tarquin in the corresponding LCModel file formats; a minor bugfix (subsequently published upstream) was required for LCM basis set import (https://github.com/martin3141/tarquin/commit/cdc44df).
NAA was used as a reference for basis set scaling. Appropriate TE, sweep width and centre frequency (echo, fs, ft) were specified on the command line. Automatic phasing and referencing (auto_phase and auto_ref) were disabled. The start point for analysis of the difference spectra (start_pnt) was set at 5 ms (calculated as 0.005 times the sample frequency). Eddy-current correction was disabled (--water_eddy false). HSVD was performed (per defaults) to remove residual water. The visual baseline (Figure 2f) is modelled from the residuals, smoothed with a cosine kernel. Outcomes from this quantification are reported as "Tarquin".
Tarquin can internally simulate basis sets on demand, using an implementation of the density matrix formulation of NMR 15,16 and the parameters of 17 ; an additional analysis was performed using this internally generated basis set: --int_basis megapress_gaba for the difference spectra. This is simulated assuming a simplified system of uncoupled spins, similar to that used by peak-integration algorithms, comprising GABA_A ( --ext_pdf true --output_txt S12_GABA_68.diff.tarquin.txt --output_csv S12_GABA_68.diff.tarquin.csv --output_pdf S12_GABA_68.diff.pdf --output_fit S12_GABA_68.tarquin.fit For "Tarquin" (with standard basis set, with or without MM3): --basis_lcm ge_diff_4096_5000.basis Or for "Tarquin (internal)": -- with segmentation of the corresponding per-subject structural T1-weighted image performed using the unified tissue segmentation algorithm of SPM12 21 .
Concentration estimates scaled to water but with no adjustment for tissue class (assuming pure water concentration per eq(3) of 22 ) were also calculated, hereafter denoted "/H2OnoTC"; see Equation 2.

D.2 Gannet
Gannet models the water peak as a Lorentzian multiplied by a Gaussian at the same centre frequency, incorporating a linear baseline into the model. The water area is taken as the sum of that optimized function over the 3. 8 -5.6

D.3 jMRUI: AMARES
Although jMRUI provides basic water referencing mechanisms internally, based on the amplitude of the reference FID, this functionality could not be batched effectively.
Therefore, a separate AMARES run was used to model the water reference, using a superposition of one Gaussian and one Lorentzian component to mimic a pseudo-Voigt model. The centre frequency for the two components was constrained to be equal, with starting values for frequency and linewidth of 4.65 ppm and 7 Hz respectively; zero-order phase and begin time were set to zero.
Intrinsic amplitudes are reported for both the water reference and the metabolite models (i.e., SM and SH2O terms are available directly), hence there is no need to reverse any applied scaling in this case.

D.4 jMRUI: QUEST
Water referencing is performed identically to the AMARES case (D.3), using the same values for water amplitude.

D.5 LCModel
LCModel measures area of the water peak by integration: it first locates the maximum value of the frequency-domain water reference data within the expected range (4.65 +/-1 ppm), then integrates the phase-corrected water reference +/-2 ppm around that maximum, assuming a linear baseline between the border regions. Refer LCModel.f lines 546, 4940, 5063 (available via http://www.s-provencher.com/lcmodel.shtml).
LCModel applies a scaling factor to the data, such that the signal strength per proton resonance is consistent between the data and the basis set (see also the LCModel user manual, http://www.s-provencher.com/pub/LCModel/manual/manual.pdf section 10.2): . scaling (see C.5); QRST 18< . and UVV5WV are appropriate to the basis set (both 1.0), #/ Met = 3 for the 2.01 ppm NAA peak. Given a consistently-scaled basis set, this is equivalent to scaling of the form: ATTH2O accounts for the attenuation of NMR-visible water signal due to additional relaxation effects, approximated as exp B

D.6 Osprey
Osprey uses a simulated water basis function to model the water reference, on a fit range from 2-7.4 ppm.
While the standardised processing pipeline evaluates the difference spectrum as diff = ( edit-OFF -edit-ON ), Osprey assumes a different convention for this: (edit-OFF -edit-ON) / 2. Therefore, an additional factor-of-two correction is needed in this instance.

D.7 Tarquin
Tarquin uses the infinity norm (peak absolute amplitude) of the time-domain water reference data for normalisation to water. The reported value is given as: and 0.76 respectively, noting that the latter is unlikely to be optimal for TE = 68 ms.

E Creatine Reference: Edit-OFF sub-spectrum modelling
For modelling of the edit-OFF sub-spectra by basis set algorithms (FSL-MRS, QUEST, LCModel, Osprey and Tarquin), a standard simulated basis set specific to each hardware vendor was adopted. These were derived for edit-OFF sub-spectra using a similar method to that used for the difference spectra (as detailed in section 2.

E.1 FSL-MRS
Edit-OFF sub-spectra were quantified with the standard edit-OFF basis set, using Cr+PCr as an internal reference for basis set scaling. For expedience, the default Newton optimisation algorithm was used; otherwise, all parameters were the same as for the difference case.

E.2 Gannet
In addition to the GABA+Glx model, Gannet independently fits a Lorentzian model to NAA around 2.01 ± 0.04 ppm, and a dual-Lorentzian model for Choline and Creatine, the latter centred at 3.02 ppm with a fixed 0.18 ppm separation, all from the edit-OFF subspectrum. A minor local modification was made to additionally yield water-referenced estimates from the existing Cr and NAA fits (usually only reported for those metabolites contained in the difference spectrum).

E.4 jMRUI: QUEST
The standard per-manufacturer edit-OFF basis sets described above were imported in jMRUI text format, manually aligned for NAA @ 2.0 ppm, before assembly into a jMRUIformat metabolite list. Other model parameters were the same as for the difference spectra.

E.5 LCModel
Edit-OFF sub-spectra were modelled with the standard per-manufacturer edit-OFF basis sets. Basic configuration was similar to that used for the difference spectra, with two key exceptions. PPMGAP was not defined -which implies fitting over the full defined fit range (0. 2 -4.2 ppm), the default mode of operation. SPTYPE was also not defined, again implying a default fit: a regular spline baseline was modelled (with default knot spacing 0.15 ppm), Creatine was used for internal basis set scaling, and LCModel's default set of soft constraints were adopted.

E.6 Osprey
Edit-OFF sub-spectra were modelled using the standard per-manufacturer edit-OFF basis sets, with otherwise similar parameters to those used for the difference spectra. This happened in the same invocation of OspreyFit.

I Additional Correlation Analyses
Supplementary Figure Figure 1: Processing (b) and modelling (c) workflow, summarising key differences between the algorithms assessed. Figure 2 Average metabolite and baseline (where applicable) models with corresponding residuals for the GABA+ edited spectra, for each algorithm. Vertical scaling is normalised; outcomes over the full fit range are presented in Supplementary Figure  8; outcomes split by vendor are presented in Supplementary Figure 9.        Table 6: Mean concentration estimates (institutional units, » mM) from each algorithm, reported across all subjects. Estimates grouped by vendor are expressed as a % difference relative to the mean across all subjects for the respective algorithm; significance indicated by *, **, *** for pholm < .05, .01 and .001 respectively.