BMC Genomics. 2019 Nov 4;20(1):805. doi: 10.1186/s12864-019-6192-1.

Cox regression increases power to detect genotype-phenotype associations in genomic studies using the electronic health record.

Author information

1. Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA. jakejhughey@gmail.com.
2. Department of Biological Sciences, Vanderbilt University, Nashville, TN, USA. jakejhughey@gmail.com.
3. Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA.
4. Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA.
5. Department of Biostatistics, Vanderbilt University Medical Center, Nashville, TN, USA.

Abstract

BACKGROUND:

The growth of DNA biobanks linked to data from electronic health records (EHRs) has enabled the discovery of numerous associations between genomic variants and clinical phenotypes. Nonetheless, although clinical data are generally longitudinal, standard approaches for detecting genotype-phenotype associations in such linked data, notably logistic regression, do not naturally account for variation in the period of follow-up or the time at which an event occurs. Here we explored the advantages of quantifying associations using Cox proportional hazards regression, which can account for the age at which a patient first visited the healthcare system (left truncation) and the age at which a patient either last visited the healthcare system or acquired a particular phenotype (right censoring).
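To make the left-truncation and right-censoring idea concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It fits a single-covariate Cox model on the age time scale, where each person enters the risk set at the age of their first visit and leaves it at the age of the event or last visit. The function names, the simulated cohort, the exponential onset model, the allele frequency of 0.3, and the exclusion of people whose onset precedes their first visit are all illustrative assumptions.

```python
import math
import random


def simulate_cohort(n, log_hr, seed=0):
    """Simulate a hypothetical EHR-like cohort (all settings are assumptions).

    Returns four parallel lists: age at first visit, age at event or
    last visit, event indicator, and genotype dosage (0, 1, or 2).
    """
    rng = random.Random(seed)
    entry, exit_, event, dosage = [], [], [], []
    while len(entry) < n:
        # Dosage of a variant with an assumed allele frequency of 0.3.
        g = (rng.random() < 0.3) + (rng.random() < 0.3)
        # Age at onset: exponential, hazard scaled by exp(log_hr * g).
        onset = rng.expovariate(0.02 * math.exp(log_hr * g))
        first = rng.uniform(20.0, 60.0)        # age at first visit
        last = first + rng.uniform(1.0, 20.0)  # age at last visit
        if onset <= first:
            continue  # assumed here: onset before first visit is excluded
        entry.append(first)
        exit_.append(min(onset, last))  # right censoring at last visit
        event.append(onset <= last)
        dosage.append(g)
    return entry, exit_, event, dosage


def fit_cox_left_truncated(entry, exit_, event, x, max_iter=25, tol=1e-9):
    """Estimate the log hazard ratio for one covariate by Newton-Raphson.

    The risk set at an event age t contains everyone with
    entry < t <= exit, which is how left truncation is handled.
    Tied event ages are treated with the Breslow approximation.
    Requires entry < exit for every subject.
    """
    beta = 0.0
    n = len(x)
    event_rows = [i for i in range(n) if event[i]]
    for _ in range(max_iter):
        weights = [math.exp(beta * xi) for xi in x]
        score = 0.0  # first derivative of the partial log-likelihood
        info = 0.0   # negative second derivative
        for i in event_rows:
            t = exit_[i]
            s0 = s1 = s2 = 0.0
            for j in range(n):
                if entry[j] < t <= exit_[j]:
                    s0 += weights[j]
                    s1 += weights[j] * x[j]
                    s2 += weights[j] * x[j] * x[j]
            mean = s1 / s0
            score += x[i] - mean
            info += s2 / s0 - mean * mean
        if info <= 0.0:
            break  # no covariate variation in any risk set
        step = score / info
        beta += step
        if abs(step) < tol:
            break
    return beta


if __name__ == "__main__":
    entry, exit_, event, dosage = simulate_cohort(1500, log_hr=0.5, seed=1)
    beta = fit_cox_left_truncated(entry, exit_, event, dosage)
    print("events:", sum(event))
    print("estimated hazard ratio per allele:", round(math.exp(beta), 2))
```

The nested loop over risk sets is quadratic in cohort size and is written for readability. For a biobank-scale scan, an established survival analysis package that accepts (entry, exit, event) triples would be the practical choice.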

RESULTS:

In comprehensive simulations, we found that, compared to logistic regression, Cox regression had greater power at equivalent Type I error. We then scanned for genotype-phenotype associations using logistic regression and Cox regression on 50 phenotypes derived from the EHRs of 49,792 genotyped individuals. Consistent with the findings from our simulations, Cox regression had approximately 10% greater relative sensitivity for detecting known associations from the NHGRI-EBI GWAS Catalog. In terms of effect sizes, the hazard ratios estimated by Cox regression were strongly correlated with the odds ratios estimated by logistic regression.

CONCLUSIONS:

As longitudinal health-related data continue to grow, Cox regression may improve our ability to identify the genetic basis for a wide range of human phenotypes.

KEYWORDS:

Cox regression; Electronic health record; GWAS; Time-to-event modeling
