Criteria for evaluating risk prediction of multiple outcomes

Stat Methods Med Res. 2020 Dec;29(12):3492-3510. doi: 10.1177/0962280220929039. Epub 2020 Jun 29.

Abstract

Risk prediction models have been developed in many contexts to classify individuals according to a single outcome, such as risk of a disease. Emerging "-omic" biomarkers provide panels of features that can simultaneously predict multiple outcomes from a single biological sample, creating issues of multiplicity reminiscent of exploratory hypothesis testing. Here I propose definitions of some basic criteria for evaluating prediction models of multiple outcomes. I define calibration in the multivariate setting and then distinguish between outcome-wise and individual-wise prediction, and within the latter between joint and panel-wise prediction. I give examples, such as screening and early detection, in which different senses of prediction may be more appropriate. In each case I propose definitions of sensitivity, specificity, concordance, positive and negative predictive value, and relative utility. I link the definitions through a multivariate probit model, showing that the accuracy of a multivariate prediction model can be summarised by its covariance with a liability vector. I illustrate the concepts on a biomarker panel for early detection of eight cancers, and on polygenic risk scores for six common diseases.
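The distinction between outcome-wise, joint and panel-wise prediction can be made concrete with a small sketch. The abstract does not state the formal definitions, so the three functions below rest on an assumed reading: outcome-wise sensitivity is computed separately for each outcome, panel-wise sensitivity asks whether any outcome is predicted for an individual who has at least one, and joint sensitivity asks whether the whole outcome vector is predicted exactly. The function names and the toy data are hypothetical and are not taken from the paper.

```python
import numpy as np

# Rows are individuals, columns are outcomes; 1 = outcome present / predicted.
# Toy data, invented for illustration only.
y = np.array([[1, 0], [0, 1], [1, 1], [0, 0]], dtype=bool)  # true outcomes
p = np.array([[1, 0], [1, 0], [1, 0], [0, 0]], dtype=bool)  # predictions


def outcome_wise_sensitivity(y, p):
    """Per outcome: fraction of true cases of that outcome predicted positive.

    Assumes every outcome has at least one true case (else division by zero).
    """
    return (y & p).sum(axis=0) / y.sum(axis=0)


def panel_wise_sensitivity(y, p):
    """Assumed reading: among individuals with at least one true outcome,
    the fraction with at least one predicted outcome (not necessarily the
    same one)."""
    affected = y.any(axis=1)
    return p[affected].any(axis=1).mean()


def joint_sensitivity(y, p):
    """Assumed reading: among individuals with at least one true outcome,
    the fraction whose full outcome vector is predicted exactly."""
    affected = y.any(axis=1)
    return (p[affected] == y[affected]).all(axis=1).mean()


print(outcome_wise_sensitivity(y, p))
print(panel_wise_sensitivity(y, p))
print(joint_sensitivity(y, p))
```

On this toy data the same predictions look perfect panel-wise, poor jointly, and mixed outcome-wise, which is the kind of divergence that motivates separating the criteria.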

Keywords: Risk prediction; biomarkers; multiplicity; multivariate analysis; polygenic risk score; screening.

MeSH terms

  • Biomarkers
  • Humans
  • Neoplasms*
  • Predictive Value of Tests

Substances

  • Biomarkers