What Is the Test-Retest Reliability of Common Task-Functional MRI Measures? New Empirical Evidence and a Meta-Analysis

Maxwell L Elliott; Annchen R Knodt; David Ireland; Meriwether L Morris; Richie Poulton; Sandhya Ramrakha; Maria L Sison; Terrie E Moffitt; Avshalom Caspi; Ahmad R Hariri

doi:10.1177/0956797620916786

Psychol Sci. 2020 Jul;31(7):792-806. doi: 10.1177/0956797620916786. Epub 2020 Jun 3.

Authors

Maxwell L Elliott¹, Annchen R Knodt¹, David Ireland², Meriwether L Morris¹, Richie Poulton², Sandhya Ramrakha², Maria L Sison¹, Terrie E Moffitt¹,³,⁴,⁵, Avshalom Caspi¹,³,⁴,⁵, Ahmad R Hariri¹

Affiliations

¹ Department of Psychology & Neuroscience, Duke University.
² Dunedin Multidisciplinary Health and Development Research Unit, Department of Psychology, University of Otago.
³ Social, Genetic, & Developmental Psychiatry Research Centre, Institute of Psychiatry, Psychology, & Neuroscience, King's College London.
⁴ Department of Psychiatry & Behavioral Sciences, Duke University School of Medicine.
⁵ Center for Genomic and Computational Biology, Duke University.

Abstract

Identifying brain biomarkers of disease risk is a growing priority in neuroscience. The ability to identify meaningful biomarkers is limited by measurement reliability; unreliable measures are unsuitable for predicting clinical outcomes. Measuring brain activity using task functional MRI (fMRI) is a major focus of biomarker development; however, the reliability of task fMRI has not been systematically evaluated. We present converging evidence demonstrating poor reliability of task-fMRI measures. First, a meta-analysis of 90 experiments (N = 1,008) revealed poor overall reliability (mean intraclass correlation coefficient [ICC] = .397). Second, the test-retest reliabilities of activity in a priori regions of interest across 11 common fMRI tasks collected by the Human Connectome Project (N = 45) and the Dunedin Study (N = 20) were poor (ICCs = .067 to .485). Collectively, these findings demonstrate that common task-fMRI measures are not currently suitable for brain biomarker discovery or for individual-differences research. We review how this state of affairs came to be and highlight avenues for improving task-fMRI reliability.
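For readers unfamiliar with the statistic reported above, the following is a minimal sketch of how a test-retest ICC can be computed from a subjects-by-sessions table. It assumes the consistency form usually written ICC(3,1), which may or may not be the form used in the article; the abstract does not say. The function name `icc_3_1` and the example values are hypothetical and are not data from the study.

```python
def icc_3_1(scores):
    """Consistency ICC, often written ICC(3,1), from a two-way ANOVA.

    scores: list of rows, one row per subject, each row holding that
    subject's measurement at each of k sessions (no missing values).
    Assumption: this form is chosen for illustration only.
    """
    n = len(scores)       # number of subjects
    k = len(scores[0])    # number of sessions (2 for simple test-retest)

    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Partition the total sum of squares into subject, session, and residual parts.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))

    # Between-subject variance relative to between-subject plus residual variance.
    return (ms_rows - ms_err) / (ms_rows + (k - 1) * ms_err)


# Hypothetical example: every subject shifts by the same amount at retest,
# so the rank order and spacing of subjects is perfectly preserved.
consistent = [[1.0, 2.0], [2.0, 3.0], [3.0, 4.0]]
print(icc_3_1(consistent))
```

Because this form ignores a uniform session-to-session shift, the example returns the maximum value; values near zero indicate that subject differences at test carry little information about subject differences at retest.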

Keywords: cognitive neuroscience; individual differences; neuroimaging; statistical analysis.

Publication types

Meta-Analysis
Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't
Research Support, U.S. Gov't, Non-P.H.S.
Systematic Review

MeSH terms

Brain / physiology*
Cognition / physiology*
Connectome / methods*
Humans
Individuality
Magnetic Resonance Imaging*
Reproducibility of Results
