Cancer Epidemiol Biomarkers Prev. 2016 Oct;25(10):1392-1401. Epub 2016 Jul 20.

The Cancer Epidemiology Descriptive Cohort Database: A Tool to Support Population-Based Interdisciplinary Research.

Author information

Epidemiology and Genomics Research Program, Division of Cancer Control and Population Sciences, NCI, NIH, Rockville, Maryland.
Office of Public Health Genomics, Centers for Disease Control and Prevention, Atlanta, Georgia.
Department of Medicine, Stanford University, Stanford, California. Department of Health Research and Policy, Stanford University, Stanford, California. Department of Statistics, Stanford University, Stanford, California. Meta-Research Innovation Center at Stanford (METRICS), Stanford University, Stanford, California.
Westat, Rockville, Maryland.
Office of Epidemiology and Research, Maternal and Child Health Bureau, Health Resources and Services Administration, Rockville, Maryland.
Division of Cancer Control and Population Sciences, NCI, NIH, Rockville, Maryland.



We report on the establishment of a web-based Cancer Epidemiology Descriptive Cohort Database (CEDCD). The CEDCD's goals are to enhance awareness of resources, facilitate interdisciplinary research collaborations, and support existing cohorts for the study of cancer-related outcomes.


Using a newly developed questionnaire, we collected comprehensive descriptive data from large cohorts established to study cancer as the primary outcome. These data included an inventory of baseline and follow-up data, biospecimens, genomics, policies, and protocols. Additional descriptive data were extracted from publicly available sources. This information was entered into a searchable, publicly accessible database. We summarized the descriptive data across cohorts and reported the characteristics of this resource.


As of December 2015, the CEDCD includes data from 46 cohorts representing more than 6.5 million individuals (29% ethnic/racial minorities). Overall, 78% of the cohorts have collected blood at least once, 57% at multiple time points, and 46% collected tissue samples. Genotyping has been performed by 67% of the cohorts, while 46% have performed whole-genome or exome sequencing in subsets of enrolled individuals. Information on medical conditions other than cancer has been collected in more than 50% of the cohorts. More than 600,000 incident cancer cases and more than 40,000 prevalent cases are reported, with 24 cancer sites represented.


The CEDCD assembles detailed descriptive information on a large number of cancer cohorts in a searchable database.


Information from the CEDCD may assist the interdisciplinary research community by facilitating both the identification of well-established population resources and large-scale collaborative and integrative research. Cancer Epidemiol Biomarkers Prev; 25(10); 1392-401. ©2016 AACR.

[Indexed for MEDLINE]
Free PMC Article
