Neuroimage. 2019 Oct 1;199:351-365. doi: 10.1016/j.neuroimage.2019.05.082. Epub 2019 Jun 5.

Quantifying performance of machine learning methods for neuroimaging data.

Author information

1
School of Psychology, Trinity College Dublin, Dublin, Ireland; Department of Translational Research in Psychiatry, Max-Planck Institute of Psychiatry, Munich, Germany.
2
School of Psychology, Trinity College Dublin, Dublin, Ireland.
3
Institut National de la Santé et de la Recherche Médicale, INSERM Unit 1000 "Neuroimaging & Psychiatry", University Paris Sud, University Paris Descartes - Sorbonne Paris Cité, and Psychiatry Department 91G16, Orsay Hospital, France.
4
Department of Child and Adolescent Psychiatry and Psychotherapy, Central Institute of Mental Health, Medical Faculty Mannheim, Heidelberg University, Square J5, 68159, Mannheim, Germany.
5
Medical Research Council - Social, Genetic and Developmental Psychiatry Centre, Institute of Psychiatry, Psychology & Neuroscience, King's College London, United Kingdom.
6
NeuroSpin, CEA, Université Paris-Saclay, F-91191, Gif-sur-Yvette, France.
7
Institut National de la Santé et de la Recherche Médicale, INSERM Unit 1000 "Neuroimaging & Psychiatry", University Paris Sud, University Paris Descartes - Sorbonne Paris Cité, and Maison de Solenn, Paris, France.
8
Bloorview Research Institute, Holland Bloorview Kids Rehabilitation Hospital and Departments of Psychology and Psychiatry, University of Toronto, Toronto, Ontario, M6A 2E1, Canada.
9
Department of Psychiatry and Neuroimaging Center, Technische Universität Dresden, Dresden, Germany.
10
Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, Berlin Institute of Health, Department of Psychiatry and Psychotherapy, Campus Charité Mitte, Charitéplatz 1, Berlin, Germany.
11
Department of Psychiatry, University of Vermont, Burlington, USA.
12
School of Psychology, Trinity College Dublin, Dublin, Ireland; Global Brain Health Institute, Trinity College Dublin, Dublin, Ireland. Electronic address: robert.whelan@tcd.ie.

Abstract

Machine learning is increasingly being applied to neuroimaging data. However, most machine learning algorithms have not been designed to accommodate neuroimaging data, which typically have many more data points than subjects, in addition to multicollinearity and low signal-to-noise. Consequently, the relative efficacy of different machine learning regression algorithms for different types of neuroimaging data is not known. Here, we sought to quantify the performance of a variety of machine learning algorithms for use with neuroimaging data with various sample sizes, feature set sizes, and predictor effect sizes. The contribution of additional machine learning techniques - embedded feature selection and bootstrap aggregation (bagging) - to model performance was also quantified. Five machine learning regression methods (Gaussian Process Regression, Multiple Kernel Learning, Kernel Ridge Regression, the Elastic Net, and Random Forest) were examined with both real and simulated MRI data and compared with standard multiple regression. The different machine learning regression algorithms produced varying results, which depended on sample size, feature set size, and predictor effect size. When the effect size was large, the Elastic Net, Kernel Ridge Regression, and Gaussian Process Regression performed well at most sample sizes and feature set sizes. However, when the effect size was small, only the Elastic Net made accurate predictions, and only in analyses with sample sizes greater than 400. Random Forest also performed moderately well for small effect sizes and, unlike the Elastic Net, did so across all sample sizes. Machine learning techniques also improved prediction accuracy for multiple regression. These data provide empirical evidence for the differential performance of various machine learning algorithms on neuroimaging data, which depends on sample size, number of features, and effect size.
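
As a rough illustration of two techniques named in the abstract, the sketch below pairs an Elastic Net (whose L1 penalty performs embedded feature selection by shrinking some coefficients to exactly zero) with bootstrap aggregation. It is not the authors' pipeline. The simulated Gaussian data, the hand-written coordinate descent solver, all function names, and all parameter values (penalty strength, number of bootstrap models, sample and feature counts) are assumptions chosen only to keep the example small and self-contained.

```python
import random


def simulate(n_subjects, n_features, effect, seed):
    """Toy data (an assumption): Gaussian features, only feature 0 predicts y."""
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0) for _ in range(n_features)] for _ in range(n_subjects)]
    y = [effect * row[0] + rng.gauss(0.0, 1.0) for row in X]
    return X, y


def soft_threshold(value, threshold):
    """Shrink value toward zero; values inside [-threshold, threshold] become 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_elastic_net(X, y, lam=0.1, l1_ratio=0.5, n_sweeps=30):
    """Cyclic coordinate descent on centered data; returns (intercept, coefficients)."""
    n, p = len(X), len(X[0])
    col_means = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    cols = [[row[j] - col_means[j] for row in X] for j in range(p)]
    residual = [yi - y_mean for yi in y]
    coef = [0.0] * p
    for _ in range(n_sweeps):
        for j, col in enumerate(cols):
            # Covariance of feature j with the residual, adding its own effect back.
            rho = sum(c * (r + c * coef[j]) for c, r in zip(col, residual)) / n
            scale = sum(c * c for c in col) / n + lam * (1.0 - l1_ratio)
            new = soft_threshold(rho, lam * l1_ratio) / scale
            if new != coef[j]:
                delta = new - coef[j]
                residual = [r - c * delta for c, r in zip(col, residual)]
                coef[j] = new
    intercept = y_mean - sum(m * b for m, b in zip(col_means, coef))
    return intercept, coef


def predict(model, X):
    intercept, coef = model
    return [intercept + sum(b * x for b, x in zip(coef, row)) for row in X]


def fit_bagged(X, y, n_models=10, seed=0, **kwargs):
    """Fit one Elastic Net per bootstrap resample (subjects drawn with replacement)."""
    rng = random.Random(seed)
    n = len(X)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_elastic_net([X[i] for i in idx], [y[i] for i in idx], **kwargs))
    return models


def predict_bagged(models, X):
    """Average the predictions of all bootstrap models."""
    per_model = [predict(m, X) for m in models]
    return [sum(vals) / len(models) for vals in zip(*per_model)]


def pearson(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (z - mb) for x, z in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((z - mb) ** 2 for z in b)
    return cov / (var_a * var_b) ** 0.5


# Train on 80 simulated subjects, evaluate on 40 held-out subjects.
X, y = simulate(n_subjects=120, n_features=10, effect=2.0, seed=1)
models = fit_bagged(X[:80], y[:80], n_models=10, seed=2)
held_out_r = pearson(predict_bagged(models, X[80:]), y[80:])
print(f"held-out correlation between predicted and observed: {held_out_r:.2f}")
```

Varying `n_subjects`, `n_features`, and `effect` in `simulate` gives a crude way to see how out-of-sample accuracy shifts with sample size, feature set size, and effect size, the three factors the abstract identifies. The simulated data here lack the multicollinearity of real MRI features, so the numbers it prints say nothing about the study's actual results.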

KEYWORDS:

Machine learning; Neuroimaging; Regression algorithms; Reproducibility

PMID:
31173905
PMCID:
PMC6688909
[Available on 2020-10-01]
DOI:
10.1016/j.neuroimage.2019.05.082
