EGEMS (Wash DC). 2018 Jun 1;6(1):13. doi: 10.5334/egems.234.

Architecture and Implementation of a Clinical Research Data Warehouse for Prostate Cancer.

Author information

Department of Biomedical Informatics, Stanford University, US.
School of Medicine Research Information Technology, Stanford University, US.
Stanford Cancer Institute, Department of Medicine, Stanford University, US.
Department of Urology, Stanford University, US.
Department of Medicine, Biomedical Informatics, Stanford University, US.



Electronic health record (EHR)-based research in oncology can be limited by missing data and a lack of structured data elements. Clinical research data warehouses for specific cancer types can enable the creation of more robust research cohorts.


We linked data from the Stanford University EHR with the Stanford Cancer Institute Research Database (SCIRDB) and the California Cancer Registry (CCR) to create a research data warehouse for prostate cancer. The database was supplemented with information from clinical trials, natural language processing of clinical notes, and surveys on patient-reported outcomes.


We identified 11,898 unique prostate cancer patients in the Stanford EHR, of whom 3,936 were matched to the Stanford cancer registry and 6,153 to the CCR. The warehouse initially included 7,158 patients who had EHR data and a match in at least one of SCIRDB and CCR.
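
The inclusion rule described here (EHR data plus a match in at least one of the two registries) can be illustrated with a minimal sketch. The function name `build_cohort` and the patient identifiers are hypothetical and invented for illustration; the abstract does not describe how the linkage or matching was actually implemented.

```python
def build_cohort(ehr_ids, scirdb_ids, ccr_ids):
    """Return patients with EHR data and a match in at least one registry.

    Hypothetical helper: it assumes each source has already been reduced
    to a collection of shared patient identifiers.
    """
    ehr = set(ehr_ids)
    registries = set(scirdb_ids) | set(ccr_ids)  # matched in either registry
    return ehr & registries                      # and present in the EHR


# Toy identifiers, invented for illustration only.
ehr = {"p1", "p2", "p3", "p4", "p5"}
scirdb = {"p1", "p2", "p9"}   # "p9" has no EHR record, so it is excluded
ccr = {"p2", "p3"}

cohort = build_cohort(ehr, scirdb, ccr)
print(sorted(cohort))  # ['p1', 'p2', 'p3']
```

In this toy example, patients "p4" and "p5" have EHR data but no registry match, so they fall outside the cohort, which mirrors how the included population can be smaller than the full set of patients identified in the EHR.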


A disease-specific clinical research data warehouse combining multiple data sources can facilitate secondary data use and enhance observational research in oncology.


Data Collection; Electronic Health Records; Quality Improvement
