Biostatistics. 2014 Oct;15(4):719-30. doi: 10.1093/biostatistics/kxu023. Epub 2014 Jun 6.

Improving upon the efficiency of complete case analysis when covariates are MNAR.

Author information

1. Centre for Statistical Methodology, London School of Hygiene and Tropical Medicine, Keppel Street, London WC1E 7HT, UK; jonathan.bartlett@lshtm.ac.uk
2. Centre for Statistical Methodology, London School of Hygiene and Tropical Medicine, Keppel Street, London WC1E 7HT, UK, and MRC Clinical Trials Unit, Kingsway, London WC2B 6NH, UK.
3. School of Social and Community Medicine, University of Bristol, Canynge Hall, 39 Whatley Road, Bristol BS8 2PS, UK.
4. Department of Applied Mathematics, Computer Science and Statistics, Ghent University, Krijgslaan 281 S9, B-9000 Ghent, Belgium.

Abstract

Missing values in covariates of regression models are a pervasive problem in empirical research. Popular approaches for analyzing partially observed datasets include complete case analysis (CCA), multiple imputation (MI), and inverse probability weighting (IPW). In the case of missing covariate values, these methods (as typically implemented) are valid under different missingness assumptions. In particular, CCA is valid under missing not at random (MNAR) mechanisms in which missingness in a covariate depends on the value of that covariate, but is conditionally independent of outcome. In this paper, we argue that in some settings such an assumption is more plausible than the missing at random assumption underpinning most implementations of MI and IPW. When the former assumption holds, although CCA gives consistent estimates, it does not make use of all observed information. We therefore propose an augmented CCA approach which makes the same conditional independence assumption for missingness as CCA, but which improves efficiency through specification of an additional model for the probability of missingness, given the fully observed variables. The new method is evaluated using simulations and illustrated through application to data on reported alcohol consumption and blood pressure from the US National Health and Nutrition Examination Survey, in which data are likely MNAR independent of outcome.
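The abstract's claim about when CCA is valid can be illustrated with a small simulation. The following is a minimal sketch and not the authors' simulation study: the linear outcome model, the logistic missingness models, and every numeric value (true slope 2, sample size, missingness coefficients) are assumptions chosen for illustration, and the function name `simulate_cca` is hypothetical. It only contrasts plain CCA under two missingness mechanisms and does not implement the augmented CCA estimator proposed in the paper.

```python
import numpy as np


def simulate_cca(n=200_000, seed=0):
    """Compare complete case slopes under two missingness mechanisms.

    Assumed data-generating model (illustrative only):
        X ~ N(0, 1),  Y = 1 + 2*X + N(0, 1) noise.
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    y = 1.0 + 2.0 * x + rng.normal(size=n)

    def expit(z):
        return 1.0 / (1.0 + np.exp(-z))

    # Mechanism A: X is MNAR, but missingness depends on X only,
    # so it is conditionally independent of the outcome Y given X.
    observed_a = rng.random(n) < expit(0.5 - 1.0 * x)

    # Mechanism B: missingness in X depends on the outcome Y,
    # which violates the assumption under which CCA is valid.
    observed_b = rng.random(n) < expit(0.5 - 1.0 * y)

    def ols_slope(xs, ys):
        # Ordinary least squares of ys on an intercept and xs.
        design = np.column_stack([np.ones_like(xs), xs])
        coef, *_ = np.linalg.lstsq(design, ys, rcond=None)
        return coef[1]

    slope_a = ols_slope(x[observed_a], y[observed_a])
    slope_b = ols_slope(x[observed_b], y[observed_b])
    return slope_a, slope_b


if __name__ == "__main__":
    slope_a, slope_b = simulate_cca()
    print(f"CCA slope, missingness depends on X only: {slope_a:.3f}")
    print(f"CCA slope, missingness depends on Y:      {slope_b:.3f}")
```

Under these assumptions, the complete case slope for mechanism A should sit close to the true value of 2, while the slope for mechanism B should be attenuated, because selecting on the outcome distorts the regression of Y on X among the complete cases.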

KEYWORDS:

Complete case analysis; Missing covariates; Missing not at random; Multiple imputation

PMID: 24907708
PMCID: PMC4173105
DOI: 10.1093/biostatistics/kxu023
[Indexed for MEDLINE]
Free PMC Article