Efficiency of regression estimates for clustered data

Biometrics. 1996 Jun;52(2):500-11.

Abstract

Statistical methods for clustered data, such as generalized estimating equations (GEE) and generalized least squares (GLS), require selecting a correlation or covariance structure to specify the dependence between observations within a cluster. Valid regression estimates can be obtained that do not depend on correct specification of the true correlation, but inappropriate specifications can result in a loss of efficiency. We derive general expressions for the asymptotic relative efficiency of GEE and GLS estimators under nested correlation structures. Efficiency is shown to depend on the covariate distribution, the cluster sizes, the response variable correlation, and the regression parameters. The results demonstrate that efficiency is quite sensitive to the between- and within-cluster variation of the covariates, and provide useful characterizations of models for which upper and lower efficiency bounds are attained. Efficiency losses for simple working correlation matrices, such as independence, can be large even for small to moderate correlations and cluster sizes.
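
The efficiency comparison described above can be illustrated numerically for one simple special case. The sketch below is not the paper's general expression: it assumes a linear model (identity link), a true exchangeable within-cluster correlation with parameter rho, and a working independence structure, and it compares the sandwich variance of the independence estimator with the variance of the correctly specified GLS estimator. The function names `exchangeable` and `efficiency_independence` are hypothetical, chosen only for this illustration.

```python
import numpy as np

def exchangeable(n, rho):
    """Exchangeable correlation matrix: 1 on the diagonal, rho elsewhere."""
    return (1.0 - rho) * np.eye(n) + rho * np.ones((n, n))

def efficiency_independence(X_clusters, rho):
    """Per-coefficient efficiency of the working-independence estimator
    relative to GLS, assuming the true correlation is exchangeable(rho).

    X_clusters: list of (n_i x p) design matrices, one per cluster.
    Returns an array of length p with values in (0, 1].
    """
    p = X_clusters[0].shape[1]
    bread = np.zeros((p, p))   # sum of X_i' X_i
    meat = np.zeros((p, p))    # sum of X_i' V_i X_i
    info = np.zeros((p, p))    # sum of X_i' V_i^{-1} X_i
    for X in X_clusters:
        V = exchangeable(X.shape[0], rho)
        bread += X.T @ X
        meat += X.T @ V @ X
        info += X.T @ np.linalg.solve(V, X)
    bread_inv = np.linalg.inv(bread)
    var_independence = bread_inv @ meat @ bread_inv  # sandwich variance
    var_gls = np.linalg.inv(info)                    # optimal variance
    return np.diag(var_gls) / np.diag(var_independence)

def make_design(n_clusters, cluster_size, between_sd, within_sd, seed=0):
    """Simulated designs (intercept plus one covariate) with chosen
    between- and within-cluster covariate variation. Illustrative only."""
    rng = np.random.default_rng(seed)
    designs = []
    for _ in range(n_clusters):
        x = (between_sd * rng.standard_normal()
             + within_sd * rng.standard_normal(cluster_size))
        designs.append(np.column_stack([np.ones(cluster_size), x]))
    return designs

if __name__ == "__main__":
    for between_sd, within_sd in [(1.0, 0.0), (1.0, 1.0)]:
        designs = make_design(200, 4, between_sd, within_sd)
        eff = efficiency_independence(designs, rho=0.5)
        print(between_sd, within_sd, eff)
```

In this setting, a covariate that is constant within each cluster (with equal cluster sizes) gives no efficiency loss under independence, whereas a covariate that varies both between and within clusters can give a marked loss, in line with the abstract's point about sensitivity to the covariate distribution.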

MeSH terms

  • Anti-Inflammatory Agents, Non-Steroidal / therapeutic use
  • Clinical Trials as Topic
  • Cluster Analysis*
  • Humans
  • Models, Statistical
  • Periodontal Diseases / diagnosis
  • Periodontal Diseases / drug therapy
  • Regression Analysis*

Substances

  • Anti-Inflammatory Agents, Non-Steroidal