Informative missingness in electronic health record systems: the curse of knowing

Diagn Progn Res. 2020 Jul 2;4:8. doi: 10.1186/s41512-020-00077-0. eCollection 2020.

Abstract

Electronic health records are a potentially valuable source of data for developing clinical prediction models. However, missing data are common in routinely collected health data, and the missingness is often informative. Informative missingness can be incorporated into a clinical prediction model, for example by adding a separate category for the missing values of a predictor variable. The predictive performance of such a model depends on the transportability of the missing data mechanism, which may be compromised once the model is deployed in practice and the predictive value of certain variables becomes known. This phenomenon is explained and illustrated using synthetic data.
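The mechanism described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' simulation: the function names (`simulate`, `fit_category_risks`, `brier_score`), the probabilities, and the sample sizes are all assumptions chosen for illustration. A binary test result is recorded more often in patients who go on to have the event, used here as a crude stand-in for testing driven by clinical suspicion, so the "missing" category carries information. The model is simply the observed event rate per category, and it is evaluated once under the same missingness mechanism and once under a changed one.

```python
import random

def simulate(n, p_test_if_event, p_test_if_no_event, seed):
    """Return (category, event) pairs; category is 'high', 'low', or 'missing'."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        high = rng.random() < 0.3                      # assumed prevalence of a high result
        event = rng.random() < (0.4 if high else 0.1)  # assumed risk given the true result
        # Assumption: the chance of being tested depends on the eventual outcome,
        # which makes the missingness informative.
        p_test = p_test_if_event if event else p_test_if_no_event
        if rng.random() < p_test:
            category = "high" if high else "low"
        else:
            category = "missing"
        rows.append((category, int(event)))
    return rows

def fit_category_risks(rows):
    """Estimate the event risk per category, with 'missing' as its own category."""
    totals, events = {}, {}
    for category, event in rows:
        totals[category] = totals.get(category, 0) + 1
        events[category] = events.get(category, 0) + event
    return {c: events[c] / totals[c] for c in totals}

def brier_score(risks, rows):
    """Mean squared difference between predicted risk and observed outcome."""
    return sum((risks[c] - y) ** 2 for c, y in rows) / len(rows)

# Development data: testing is strongly related to the outcome.
development = simulate(20000, 0.9, 0.3, seed=1)
risks = fit_category_risks(development)

# Deployment with the same mechanism, and with testing unrelated to the outcome.
same_mechanism = simulate(20000, 0.9, 0.3, seed=2)
changed_mechanism = simulate(20000, 0.6, 0.6, seed=3)

brier_same = brier_score(risks, same_mechanism)
brier_changed = brier_score(risks, changed_mechanism)

print("Estimated risks:", {c: round(r, 3) for c, r in sorted(risks.items())})
print("Brier score, same mechanism:   ", round(brier_same, 4))
print("Brier score, changed mechanism:", round(brier_changed, 4))
```

Under these assumed settings the "missing" category receives the lowest estimated risk during development, and the Brier score worsens when testing no longer depends on the outcome, even though the relation between the true test result and the outcome is unchanged.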

Keywords: Missing data; Prediction modelling; Routine care data.