Stat Med. 2001 Jan 15;20(1):139-160.

Efficient regression calibration for logistic regression in main study/internal validation study designs with an imperfect reference instrument.

Author information

  • Departments of Epidemiology and Biostatistics, Harvard School of Public Health, 677 Huntington Avenue, Boston, MA 02115, USA.


An extension to the version of the regression calibration estimator proposed by Rosner et al. for logistic and other generalized linear regression models is given for main study/internal validation study designs. This estimator combines the information about the parameter of interest contained in the internal validation study with Rosner et al.'s regression calibration estimate, using a generalized inverse-variance weighted average. It is shown that the validation study selection model can be ignored as long as this model is jointly independent of the outcome and the incompletely observed covariates, conditional, at most, upon the surrogates and other completely observed covariates. In an extensive simulation study designed to follow a complex, multivariate setting in nutritional epidemiology, it is shown that with validation study sizes of 340 or more, this estimator appears to be asymptotically optimal in the sense that it is nearly unbiased and nearly as efficient as a properly specified maximum likelihood estimator. A modification to the regression calibration variance estimator, which replaces the standard uncorrected logistic regression coefficient variance with the sandwich estimator to account for possible misspecification of the logistic regression fit to the surrogate covariates in the main study, was also studied in the same simulation experiment. In this study, the alternative variance formula yielded results virtually identical to those of the original formula. A version of the proposed estimator is also derived for the case where the reference instrument, available only in the validation study, is imperfect but unbiased at the individual level and contains error that is uncorrelated with other covariates and with error in the surrogate instrument. Replicate measures are obtained in a subset of study participants. In this case it is shown that the validation study selection model can be ignored when sampling into the validation study depends, at most, only upon perfectly measured covariates. Two data sets, a study of fever in relation to occupational exposure to antineoplastics among hospital pharmacists and a study of breast cancer incidence in relation to dietary intakes of alcohol and vitamin A, adjusted for total energy intake, from the Nurses' Health Study, were analysed using these new methods. In these data, because the validation studies contained fewer than 200 observations and the events of interest were relatively rare, as is typical, the potential improvements offered by this new estimator were not apparent.
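
The combining step described in the abstract, an inverse-variance weighted average of the regression calibration estimate and the estimate from the internal validation study, can be illustrated with a minimal sketch. It assumes the two coefficient estimates are independent, so the weights are simply the inverse covariance matrices; the generalized weighting in the paper may differ, for example by accounting for correlation between the two estimates. The function name, argument names, and all numbers below are hypothetical and are not taken from the paper.

```python
import numpy as np


def combine_inverse_variance(beta_rc, cov_rc, beta_val, cov_val):
    """Matrix-weighted average of two coefficient vectors.

    Assumption (not from the source): the two estimates are independent,
    so each is weighted by the inverse of its own covariance matrix.
    """
    w_rc = np.linalg.inv(cov_rc)      # precision of the regression calibration estimate
    w_val = np.linalg.inv(cov_val)    # precision of the validation-study estimate
    cov_comb = np.linalg.inv(w_rc + w_val)
    beta_comb = cov_comb @ (w_rc @ beta_rc + w_val @ beta_val)
    return beta_comb, cov_comb


# Illustrative, made-up inputs: two log odds ratio coefficients.
beta_rc = np.array([0.40, -0.10])     # hypothetical regression calibration estimate
cov_rc = np.diag([0.04, 0.01])        # hypothetical covariance of that estimate
beta_val = np.array([0.60, -0.30])    # hypothetical validation-study estimate
cov_val = np.diag([0.16, 0.04])       # hypothetical covariance (smaller study, larger variance)

beta_comb, cov_comb = combine_inverse_variance(beta_rc, cov_rc, beta_val, cov_val)
print(beta_comb)
print(np.diag(cov_comb))
```

Under this independence assumption the combined variance is never larger than either input variance. This also suggests why a small validation study, whose estimate has a large variance and therefore a small weight, would add little to the regression calibration estimate.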

[PubMed - indexed for MEDLINE]