Synthesizing external aggregated information in the penalized Cox regression under population heterogeneity

Stat Med. 2021 Oct 15;40(23):4915-4930. doi: 10.1002/sim.9101. Epub 2021 Jun 16.

Abstract

Synthesizing external aggregated information has proven useful for improving estimation efficiency when only a limited amount of data is available for analysis. In this paper, we develop a unified framework for combining information from high-dimensional individual-level data and potentially low-dimensional external aggregate data under the Cox model. We summarize various forms of external aggregated information by population estimating equations and propose a penalized empirical likelihood approach to borrow information from these estimating equations. The proposed methods are flexible enough to handle the case where the individual-level data and the external aggregate data come from heterogeneous populations. Specifically, a penalized empirical likelihood ratio test is developed to check for potential heterogeneity, and a semiparametric density ratio model is postulated to account for it. Moreover, we study the impact of uncertainty in the auxiliary information on the efficiency gain and propose a modified variance estimator to adjust for the uncertainty. The proposed estimators enjoy the oracle property and are asymptotically more efficient than the penalized partial likelihood estimator that does not exploit the external aggregated information. Simulation studies show improvement in both estimation efficiency and variable selection over the competitors. The proposed approaches are applied to the analysis of a pediatric kidney transplant study for illustration.
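For orientation, the baseline that the abstract compares against, a penalized partial likelihood estimator that uses no external information, can be sketched as an objective function. This is a minimal illustration and not the authors' implementation: the function names, the Breslow-style risk sets, the scaling by the sample size, and the L1 penalty are all assumptions made here, since the abstract does not say which penalty is used. The empirical likelihood step that incorporates the external estimating equations is not shown.

```python
import numpy as np

def neg_log_partial_likelihood(beta, times, events, Z):
    """Negative Cox log partial likelihood (Breslow-style risk sets).

    times  : observed follow-up times, shape (n,)
    events : 1 if the event was observed, 0 if censored, shape (n,)
    Z      : covariate matrix, shape (n, p)
    """
    beta = np.asarray(beta, dtype=float)
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=bool)
    Z = np.asarray(Z, dtype=float)

    eta = Z @ beta  # linear predictor for each subject
    # at_risk[i, j] is True when subject j is still at risk at times[i]
    at_risk = times[None, :] >= times[:, None]
    # log of the risk-set sum of exp(eta_j), shifted by the max for stability
    shift = eta.max()
    log_risk = shift + np.log((at_risk * np.exp(eta - shift)[None, :]).sum(axis=1))
    # only subjects with an observed event contribute a term
    return -np.sum((eta - log_risk)[events])

def l1_penalized_objective(beta, times, events, Z, lam):
    """Scaled negative log partial likelihood plus an L1 penalty.

    The L1 penalty is a stand-in chosen for this sketch; the paper's
    penalty is not specified in the abstract.
    """
    n = len(times)
    nll = neg_log_partial_likelihood(beta, times, events, Z)
    return nll / n + lam * np.sum(np.abs(beta))
```

Minimizing `l1_penalized_objective` over `beta` with any suitable optimizer gives the kind of estimator that ignores external information; the methods described in the abstract aim to improve on it by adding constraints derived from the external aggregate data.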

Keywords: empirical likelihood; information synthesis; meta-analysis; population heterogeneity; regularized likelihood.

MeSH terms

  • Child
  • Computer Simulation
  • Humans
  • Likelihood Functions
  • Proportional Hazards Models
  • Research Design*
  • Uncertainty