Clin Trials. 2019 Jun;16(3):273-282. doi: 10.1177/1740774519833679. Epub 2019 Mar 13.

Improving pragmatic clinical trial design using real-world data.

Author information

1 Biostatistics Unit, Kaiser Permanente Washington Health Research Institute, Seattle, WA, USA.
2 Department of Biostatistics, University of Washington, Seattle, WA, USA.
3 RAND Corporation, Santa Monica, CA, USA.
4 Kaiser Permanente Washington Health Research Institute, Seattle, WA, USA.



Pragmatic clinical trials often use automated data sources such as electronic health records, claims, or registries to identify eligible individuals and collect outcome information. A specific advantage of automated data collection is that data on potential participants are often available when design decisions are being made. We outline how these data can be used to inform trial design.


Our work is motivated by a pragmatic clinical trial evaluating the impact of suicide-prevention outreach interventions on fatal and non-fatal suicide attempts in the 18 months after randomization. We illustrate our recommended approaches for designing pragmatic clinical trials using historical data from the health systems participating in this study. Specifically, we illustrate how electronic health record data can be used to inform the selection of trial eligibility requirements, to estimate the distribution of participant characteristics over the course of the trial, and to conduct power and sample size calculations.


Data from 122,873 people with Patient Health Questionnaire (PHQ) responses recorded in their electronic health records between 1 July 2010 and 31 March 2012 were used to show that the suicide attempt rate in the 18 months following completion of the questionnaire varies by response to item nine of the PHQ. We estimated that the proportion of individuals with a prior recorded elevated PHQ (i.e. history of suicidal ideation) would decrease from approximately 50% at the beginning of a trial to about 5% after 50 weeks. Simulations based on electronic health record data estimated that randomizing 8000 participants per arm would provide 90% power to detect a 25% reduction in the suicide attempt rate in the intervention arm compared to usual care at a significance level (alpha) of 0.05.
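The general idea of a simulation-based power calculation can be sketched as follows. This is a minimal illustration only, not the authors' method: the abstract describes simulations built on historical electronic health record data, whereas this sketch assumes a single constant event rate per arm, independent binary outcomes, and a two-sided pooled two-proportion z-test. The function names and the baseline rate used in the example call are hypothetical, since the abstract does not report the baseline suicide attempt rate.

```python
import math
import random


def two_proportion_p_value(events_a, events_b, n_per_arm):
    """Two-sided p-value from a pooled two-proportion z-test (equal arm sizes)."""
    pooled = (events_a + events_b) / (2 * n_per_arm)
    se = math.sqrt(2 * pooled * (1 - pooled) / n_per_arm)
    if se == 0:
        # Both arms had no events (or only events): no evidence of a difference.
        return 1.0
    z = (events_a - events_b) / n_per_arm / se
    # 2 * (1 - Phi(|z|)) expressed with the complementary error function.
    return math.erfc(abs(z) / math.sqrt(2))


def simulate_power(baseline_rate, relative_reduction, n_per_arm,
                   alpha=0.05, n_sims=500, seed=0):
    """Estimate power as the share of simulated trials with p < alpha.

    baseline_rate is an ASSUMED event probability in the usual-care arm;
    relative_reduction is the assumed proportional reduction under intervention.
    """
    rng = random.Random(seed)
    treated_rate = baseline_rate * (1 - relative_reduction)
    rejections = 0
    for _ in range(n_sims):
        # Draw each participant's outcome as an independent Bernoulli trial.
        control_events = sum(rng.random() < baseline_rate for _ in range(n_per_arm))
        treated_events = sum(rng.random() < treated_rate for _ in range(n_per_arm))
        p_value = two_proportion_p_value(control_events, treated_events, n_per_arm)
        if p_value < alpha:
            rejections += 1
    return rejections / n_sims


if __name__ == "__main__":
    # Hypothetical inputs: the 2% baseline rate is a placeholder, not a study value.
    power = simulate_power(baseline_rate=0.02, relative_reduction=0.25,
                           n_per_arm=8000, n_sims=200)
    print(f"Estimated power: {power:.2f}")
```

In a data-informed design, the constant `baseline_rate` would presumably be replaced by outcomes resampled from historical records, so that the simulated trial population reflects the observed mix of risk levels.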


Historical data can be used to inform the design of pragmatic clinical trials, a strength of trials that use automated data collection for randomizing participants and assessing outcomes. In particular, realistic sample size calculations can be conducted using real-world data from the health systems in which the trial will be conducted. Data-informed trial design should yield more realistic estimates of statistical power and maximize efficiency of trial recruitment.


Keywords: Electronic medical records; mental health; power calculations; pragmatic clinical trials; randomized trial design; sample size calculations; study design; suicide prevention

[Available on 2020-06-01]
