Am J Epidemiol. 2019 Jun 5. pii: kwz127. doi: 10.1093/aje/kwz127. [Epub ahead of print]

Two-phase, generalized case-control designs for quantitative longitudinal outcomes.

Author information

Department of Biostatistics, Vanderbilt University Medical Center, Nashville, TN.
Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA.
Division of Nephrology, Department of Medicine, University of Washington, Seattle, WA.
Division of Intramural Population Health Research, Eunice Kennedy Shriver National Institute of Child Health and Human Development.
Department of Radiology, Massachusetts General Hospital, Boston, MA.
Department of Population Health, University of Texas Dell Medical School, Austin, TX.
Department of Biostatistics, University of Washington School of Public Health, Seattle, WA.


We propose a general class of two-phase, epidemiological study designs for quantitative, longitudinal data that are useful when phase-one longitudinal outcome and covariate data are available, but the exposure (e.g., biomarker) can only be collected on a subset of subjects during phase two. To conduct a design in the class, one first summarizes the longitudinal outcomes by fitting a simple linear regression of the response on a time-varying covariate for each subject. Sampling strata are defined by splitting the estimated regression intercept or slope distributions into distinct (low, medium, and high) regions. Stratified sampling is then conducted from strata defined by the intercepts, by the slopes, or by a mixture of the two. In general, samples selected with extreme intercept values will yield low variances for time-fixed exposure associations with the outcome, and samples enriched with extreme slope values will yield low variances for time-varying exposure associations with the outcome (including interactions with time-varying exposures). We describe ascertainment-corrected maximum likelihood and multiple imputation estimation procedures that permit valid and efficient inferences. We embed all methodological developments within the framework of conducting a sub-study that seeks to examine genetic associations with lung function in continuously smoking participants in the Lung Health Study.
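The sampling steps described in the abstract (per-subject regression, strata from the intercept or slope distribution, stratified phase-two selection) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the simulated data, the 10th/90th percentile cut points, the per-stratum sample sizes, and the function names `subject_ols`, `assign_strata`, and `stratified_sample`. The ascertainment-corrected likelihood and multiple imputation estimators are not implemented.

```python
import numpy as np


def subject_ols(times, y):
    """Simple linear regression of one subject's outcomes on time.

    Returns (intercept, slope).
    """
    slope, intercept = np.polyfit(times, y, 1)
    return intercept, slope


def assign_strata(summary, lower_q=0.1, upper_q=0.9):
    """Label each subject 0 (low), 1 (medium), or 2 (high).

    The quantile cut points are illustrative assumptions.
    """
    lo, hi = np.quantile(summary, [lower_q, upper_q])
    return np.digitize(summary, [lo, hi])


def stratified_sample(strata, n_per_stratum, rng):
    """Draw subjects without replacement within each stratum."""
    selected = []
    for stratum, n in n_per_stratum.items():
        members = np.flatnonzero(strata == stratum)
        take = min(n, members.size)
        selected.append(rng.choice(members, size=take, replace=False))
    return np.sort(np.concatenate(selected))


# Simulated phase-one data (hypothetical): 500 subjects, 5 visits each.
rng = np.random.default_rng(0)
n_subjects = 500
times = np.arange(5.0)
b0 = rng.normal(0.0, 1.0, n_subjects)   # subject-specific intercept deviations
b1 = rng.normal(0.0, 0.3, n_subjects)   # subject-specific slope deviations
noise = rng.normal(0.0, 0.2, (n_subjects, times.size))
y = (2.5 + b0)[:, None] + (-0.05 + b1)[:, None] * times + noise

# Step 1: summarize each subject by an estimated intercept and slope.
fits = np.array([subject_ols(times, yi) for yi in y])

# Step 2: define low/medium/high strata from the slope distribution.
slope_strata = assign_strata(fits[:, 1])

# Step 3: oversample the extreme-slope strata for phase-two exposure collection.
phase_two = stratified_sample(slope_strata, {0: 40, 1: 20, 2: 40}, rng)

# Stratum-specific sampling fractions, which a design-aware analysis would need.
stratum_sizes = np.bincount(slope_strata, minlength=3)
sampled_sizes = np.bincount(slope_strata[phase_two], minlength=3)
sampling_fractions = sampled_sizes / stratum_sizes
```

Passing `fits[:, 0]` to `assign_strata` instead would stratify on the intercepts, the variant the abstract associates with time-fixed exposure effects.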


ascertainment corrected likelihood; case-control study; conditional likelihood; linear mixed models; longitudinal data; multiple imputation; outcome dependent sampling; response selective sampling

