Logistic analysis of studies with two-stage sampling: a comparison of four approaches

Stat Med. 1997;16(1-3):117-32. doi: 10.1002/(sici)1097-0258(19970130)16:2<117::aid-sim475>3.0.co;2-5.

Abstract

This paper discusses the analysis of two-stage studies in which covariates are missing or measured with error at the first stage of sampling and are validated in a subsample at the second stage. Four recently developed approaches are reviewed, and some connections between them are established: the weighted pseudo-likelihood method of Flanders and Greenland (1991), the pseudo-conditional likelihood methods of Breslow and Cain (1988) and of Schill et al. (1993), and the maximum likelihood estimate obtained via the EM algorithm (Wacholder and Weinberg, 1994). It is shown that, with respect to odds ratio estimation, case-control designs can be analysed as if first-stage sampling had been prospective. The procedures are compared numerically with respect to asymptotic relative efficiency in a missing value setting.
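To make the weighting idea concrete, here is a minimal sketch of a generic inverse-sampling-fraction weighted logistic fit, in the spirit of a weighted pseudo-likelihood analysis. It is not claimed to reproduce the estimator of Flanders and Greenland (1991) or any of the other three approaches, and it does not compute valid standard errors. All function names, the stratum coding, and the toy counts are hypothetical. The assumption is that first-stage strata are defined by disease status y and a first-stage surrogate z, and that the true covariate x is observed only in the second-stage subsample.

```python
import numpy as np


def inverse_sampling_weights(strata, first_stage_counts):
    """Weight each second-stage subject by N_s / n_s for its stratum s.

    N_s is the first-stage count and n_s the second-stage count in stratum s.
    """
    strata = np.asarray(strata)
    labels, counts = np.unique(strata, return_counts=True)
    n_second = dict(zip(labels.tolist(), counts.tolist()))
    return np.array([first_stage_counts[s] / n_second[s] for s in strata.tolist()])


def weighted_logistic_fit(X, y, w, n_iter=50, tol=1e-10):
    """Solve the weighted score equations sum_i w_i x_i (y_i - p_i) = 0 by Newton-Raphson."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))              # fitted probabilities
        score = X.T @ (w * (y - p))                      # weighted score vector
        info = (X * (w * p * (1.0 - p))[:, None]).T @ X  # weighted information matrix
        step = np.linalg.solve(info, score)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta


# Hypothetical second-stage subsample: 4 subjects from each (y, z) stratum.
y = np.array([1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0])
z = np.array([1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0])
x = np.array([1, 1, 1, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1])

# Stratum label 2*y + z, with hypothetical first-stage counts per stratum.
strata = 2 * y + z
first_stage_counts = {3: 100, 2: 50, 1: 80, 0: 200}

w = inverse_sampling_weights(strata, first_stage_counts)
X = np.column_stack([np.ones(len(y)), x])  # intercept and true covariate
beta = weighted_logistic_fit(X, y, w)
print("weighted log odds ratio for x:", beta[1])
```

Because this toy model is saturated (intercept plus one binary covariate), the fitted slope coincides with the log cross-product ratio of the weighted 2x2 table, which gives a simple check on the fit. A real analysis would also need a variance estimator that accounts for the two-stage design.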

MeSH terms

  • Algorithms
  • Case-Control Studies
  • Humans
  • Incidence
  • Likelihood Functions
  • Logistic Models*
  • Lung Neoplasms / epidemiology
  • Lung Neoplasms / etiology
  • Odds Ratio
  • Retrospective Studies
  • Risk Factors
  • Sampling Studies
  • Smoking / adverse effects