Conditional-cumulant-of-exposure method in logistic missing covariate regression

Biometrics. 2000 Mar;56(1):98-105. doi: 10.1111/j.0006-341x.2000.00098.x.

Abstract

We consider estimation in logistic regression where some covariates may be missing at random. Satten and Kupper (1993, Journal of the American Statistical Association 88, 200-208) proposed estimating odds ratio parameters using methods based on the probability of exposure. By approximating a partial likelihood, we extend their idea and propose a method that estimates the cumulant-generating function of the missing covariate given observed covariates and surrogates in the controls. The proposed method first estimates some lower-order cumulants of the conditional distribution of the unobserved data and then solves the resulting estimating equation for the logistic regression parameter. A simple version of the method replaces a missing covariate with the sum of its conditional mean and conditional variance given the observed data in the controls. One important property of the proposed method is that it remains applicable when validation data are available only on controls; in that setting, a class of inverse selection probability weighted semiparametric estimators cannot be applied because the selection probabilities for cases are zero. Although technically inconsistent, the proposed estimator performs well unless the relative risk parameters are large. Small-sample simulations are conducted, and the method is illustrated with an analysis of real data.
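
The "simple version" described in the abstract can be illustrated with a small numerical sketch. It rests on one reading that the abstract does not spell out: the term beta_x * X in the linear predictor is replaced by the cumulant-generating function K(beta_x) of X given the observed data in the controls, truncated after the second cumulant, so that K(t) is approximated by t * mean + t^2 * variance / 2. All function names, parameter names, and numerical values below are hypothetical and chosen only for illustration; this is not the authors' estimator, which also involves estimating the cumulants and solving an estimating equation.

```python
import math


def truncated_cgf(t, cond_mean, cond_var):
    """Second-order truncation of a cumulant-generating function K(t).

    Assumption: only the first two cumulants (conditional mean and
    conditional variance in the controls) are retained.
    """
    return t * cond_mean + 0.5 * t ** 2 * cond_var


def imputed_linear_predictor(beta0, beta_x, beta_z, z, cond_mean, cond_var):
    """Linear predictor with beta_x * X replaced by the truncated K(beta_x).

    Hypothetical helper: z is an always-observed covariate, while
    cond_mean and cond_var describe the missing covariate X given the
    observed data in the controls.
    """
    return beta0 + beta_z * z + truncated_cgf(beta_x, cond_mean, cond_var)


def log_mgf_normal_numeric(t, mean, variance, n=20001, width=12.0):
    """log E[exp(t X)] for X ~ Normal(mean, variance), by the trapezoid rule.

    Used only as a check: for a normal X the second-order truncation
    is exact, so both functions should agree.
    """
    sd = math.sqrt(variance)
    lo = mean - width * sd
    hi = mean + width * sd
    h = (hi - lo) / (n - 1)
    total = 0.0
    for i in range(n):
        x = lo + i * h
        density = math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))
        weight = 0.5 if i in (0, n - 1) else 1.0
        total += weight * math.exp(t * x) * density * h
    return math.log(total)


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(truncated_cgf(0.7, 1.2, 0.5))
    print(log_mgf_normal_numeric(0.7, 1.2, 0.5))
    print(imputed_linear_predictor(-3.0, 0.7, 0.2, 1.0, 1.2, 0.5))
```

When the conditional distribution of the missing covariate is normal, the truncation loses nothing. For skewed or heavy-tailed distributions the omitted higher-order cumulants are multiplied by higher powers of beta_x, which is in line with the abstract's remark that performance degrades when the relative risk parameters are large.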

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Biometry
  • Body Mass Index
  • Epidemiologic Methods*
  • Humans
  • Logistic Models*
  • Proportional Hazards Models
  • Risk Factors
  • Smoking / adverse effects
  • Urinary Bladder Neoplasms / epidemiology
  • Urinary Bladder Neoplasms / etiology