PLoS Biol. 2018 Dec 10;16(12):e3000070. doi: 10.1371/journal.pbio.3000070. eCollection 2018 Dec.

Analysis validation has been neglected in the Age of Reproducibility.

Author information

1. Northeastern University Marine Science Center, Northeastern University, Boston, Massachusetts, United States of America.
2. Institute for Biomedical Informatics, Division of Informatics, Department of Biostatistics, Epidemiology, & Informatics, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America.
3. Department of Biology and Marine Biology, University of North Carolina Wilmington, Wilmington, North Carolina, United States of America.

Abstract

Increasingly complex statistical models are being used for the analysis of biological data. Recent commentary has focused on the ability to compute the same outcome for a given dataset (reproducibility). We argue that a reproducible statistical analysis is not necessarily valid because of unique patterns of nonindependence in every biological dataset. We advocate that analyses should be evaluated with known-truth simulations that capture biological reality, a process we call "analysis validation." We review the process of validation and suggest criteria that a validation project should meet. We find that different fields of science have historically failed to meet all criteria, and we suggest ways to implement meaningful validation in training and practice.
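The "known-truth simulation" the abstract advocates can be sketched in a few lines: generate data from a model whose parameters are known, run the analysis under evaluation, and check that the known truth is recovered. The linear model, parameter values, and estimator below are illustrative assumptions for the sketch, not taken from the paper.

```python
import numpy as np

# Known-truth simulation sketch: data are generated from parameters we
# control, so we can validate whether the analysis recovers them.
rng = np.random.default_rng(42)

true_slope, true_intercept = 2.0, 1.0  # the "known truth" (assumed values)
n = 1000
x = rng.uniform(0, 10, size=n)
y = true_intercept + true_slope * x + rng.normal(0, 1.0, size=n)

# The analysis being validated: ordinary least squares via polyfit.
slope, intercept = np.polyfit(x, y, deg=1)

# Validation check: estimates should land close to the simulated truth.
print(abs(slope - true_slope) < 0.1)       # near-zero slope error
print(abs(intercept - true_intercept) < 0.3)
```

A realistic validation would repeat this across many simulated datasets whose structure (e.g., the nonindependence patterns the abstract highlights) matches the biological data at hand, checking bias and error rates rather than a single fit.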
