Pharmacoepidemiol Drug Saf. 2016 Apr;25(4):453-61. doi: 10.1002/pds.3983. Epub 2016 Feb 15.

Evaluation of propensity scores, disease risk scores, and regression in confounder adjustment for the safety of emerging treatment with group sequential monitoring.

Author information

Institute for Health Research, Kaiser Permanente Colorado, Denver, CO, USA.
Biostatistics Unit, Group Health Research Institute, Seattle, WA, USA.
Department of Population Medicine, Harvard Pilgrim Health Care Institute and Harvard Medical School, Boston, MA, USA.
Center for Drug Evaluation and Research, U.S. Food and Drug Administration, Silver Spring, MD, USA.
Department of Biostatistics and Epidemiology, University of Pennsylvania, Philadelphia, PA, USA.
Division of Research, Kaiser Permanente Northern California, Oakland, CA, USA.



The objective of this study was to evaluate regression, matching, and stratification on propensity score (PS) or disease risk score (DRS) in a setting of sequential analyses where statistical hypotheses are tested multiple times.
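The reason repeated testing matters can be illustrated with a small simulation. The sketch below is not the study's simulation design. It assumes two equally sized groups with identical outcome risk (so the null hypothesis is true), a pooled two-proportion z-test at every look, and an unadjusted critical value of 1.96. The function names and all parameter values are hypothetical.

```python
import math
import random


def z_two_proportions(x1, n1, x0, n0):
    """z statistic for a difference in proportions, using the pooled variance."""
    p = (x1 + x0) / (n1 + n0)
    se = math.sqrt(p * (1 - p) * (1 / n1 + 1 / n0))
    if se == 0:
        return 0.0
    return (x1 / n1 - x0 / n0) / se


def simulate_false_signal_rate(n_looks=5, n_per_look=200, risk=0.1,
                               critical_z=1.96, n_sims=2000, seed=0):
    """Share of simulated studies that signal at any look although no effect exists."""
    rng = random.Random(seed)
    signals = 0
    for _ in range(n_sims):
        x1 = x0 = n1 = n0 = 0
        for _ in range(n_looks):
            # Both groups share the same risk, so any signal is a false positive.
            x1 += sum(rng.random() < risk for _ in range(n_per_look))
            x0 += sum(rng.random() < risk for _ in range(n_per_look))
            n1 += n_per_look
            n0 += n_per_look
            # Test on the cumulative data and stop at the first signal.
            if abs(z_two_proportions(x1, n1, x0, n0)) > critical_z:
                signals += 1
                break
    return signals / n_sims


if __name__ == "__main__":
    print("1 look :", simulate_false_signal_rate(n_looks=1))
    print("5 looks:", simulate_false_signal_rate(n_looks=5))
```

With a single look the false-signal rate should sit near the nominal 5%, while five unadjusted looks should push it well above that. This is why group sequential monitoring uses adjusted signaling boundaries.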


In a setting of sequential analyses, we simulated incident users and binary outcomes under varying confounding strength, outcome incidence, and treatment adoption rate. We compared Type I error rate, empirical power, and time to signal using the following confounder adjustments: (i) regression; (ii) treatment matching (1:1 or 1:4) on PS or DRS; and (iii) stratification on PS or DRS. We estimated PS and DRS using lookwise methods and cumulative methods (all data up to the current look). We applied these confounder adjustments in examining the association between non-steroidal anti-inflammatory drugs and bleeding.
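As a rough illustration of stratification on a score, the sketch below assumes a PS or DRS has already been estimated for each subject (for example, from a logistic model fitted to all data up to the current look in the cumulative approach). It then groups subjects into quantile strata and pools the stratum-specific 2x2 tables with a Mantel-Haenszel odds ratio. The Mantel-Haenszel estimator and the helper names `quantile_strata` and `mantel_haenszel_or` are assumptions made for this example, not necessarily the estimator used in the study.

```python
from collections import defaultdict


def quantile_strata(scores, n_strata=5):
    """Assign each subject to a roughly equal-sized stratum by rank of its score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    strata = [0] * len(scores)
    for rank, i in enumerate(order):
        strata[i] = min(rank * n_strata // len(scores), n_strata - 1)
    return strata


def mantel_haenszel_or(treated, outcome, strata):
    """Mantel-Haenszel odds ratio for treatment vs. outcome, pooled over strata."""
    # Per stratum: a = treated with event, b = treated without event,
    #              c = untreated with event, d = untreated without event.
    cells = defaultdict(lambda: [0, 0, 0, 0])
    for t, y, s in zip(treated, outcome, strata):
        idx = (0 if y else 1) if t else (2 if y else 3)
        cells[s][idx] += 1
    num = den = 0.0
    for a, b, c, d in cells.values():
        n = a + b + c + d
        num += a * d / n
        den += b * c / n
    return num / den if den > 0 else float("nan")


if __name__ == "__main__":
    # Toy data: (score, treated, outcome) tuples; purely illustrative values.
    rows = [(0.2, 1, 0), (0.2, 0, 0), (0.3, 1, 1), (0.3, 0, 1),
            (0.7, 1, 1), (0.7, 0, 0), (0.8, 1, 0), (0.8, 0, 1)]
    scores = [r[0] for r in rows]
    treated = [r[1] for r in rows]
    outcome = [r[2] for r in rows]
    strata = quantile_strata(scores, n_strata=2)
    print(mantel_haenszel_or(treated, outcome, strata))
```

Passing a single constant stratum to `mantel_haenszel_or` gives the crude (unadjusted) odds ratio, which makes it easy to compare adjusted and unadjusted estimates at each look.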


PS and DRS methods had similar empirical power and time to signal. However, DRS methods yielded Type I error rates of up to 17% for 1:4 matching and 15.3% for stratification when treatment and outcome were common and the confounders were more strongly associated with treatment. When treatment and outcome were not common, stratification on PS and DRS and regression yielded Type I error rates of 8-10% and inflated empirical power. When outcome and treatment were common, however, both regression and stratification on PS outperformed the matching methods, with Type I error rates close to 5%.


We suggest regression and stratification on PS when the outcome and/or treatment is common, and matching on PS with higher ratios when the outcome or treatment is rare or moderately rare.


disease risk score; group sequential analyses; matching; pharmacoepidemiology; propensity score; stratification

[Indexed for MEDLINE]
Free PMC Article
