Genet Epidemiol. 2016 Nov;40(7):544-557. doi: 10.1002/gepi.21997. Epub 2016 Aug 31.

Identifying significant gene-environment interactions using a combination of screening testing and hierarchical false discovery rate control.

Author information

1. Departments of Biomedical Data Science and Genetics, Institute for Quantitative Biomedical Sciences, Geisel School of Medicine, Dartmouth College, Hanover, NH 03755, USA. rob.frost@dartmouth.edu.
2. Center for Neuroimaging and Indiana Alzheimer's Disease Center, Department of Radiology and Imaging Sciences, Indiana University School of Medicine, Indianapolis, IN 46202, USA.
3. Departments of Biomedical Data Science and Genetics, Institute for Quantitative Biomedical Sciences, Geisel School of Medicine, Dartmouth College, Hanover, NH 03755, USA.
4. Division of Informatics, Department of Biostatistics and Epidemiology, Institute for Biomedical Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104-6021, USA.

Abstract

Although gene-environment (G×E) interactions play an important role in many biological systems, detecting these interactions within genome-wide data can be challenging due to the loss of statistical power incurred by multiple hypothesis correction. To address the challenge of poor power and the limitations of existing multistage methods, we recently developed a screening-testing approach for G×E interaction detection that combines elastic net penalized regression with joint estimation to support a single omnibus test for the presence of G×E interactions. In our original work on this technique, however, we did not assess type I error control or power, and we evaluated the method using just a single, small bladder cancer data set. In this paper, we extend the original method in two important directions and provide a more rigorous performance evaluation. First, we introduce a hierarchical false discovery rate (FDR) approach to formally assess the significance of individual G×E interactions. Second, to support the analysis of truly genome-wide data sets, we incorporate a score statistic-based prescreening step to reduce the number of single nucleotide polymorphisms (SNPs) prior to fitting the first-stage penalized regression model. To assess the statistical properties of our method, we compare the type I error rate and statistical power of our approach with those of competing techniques using both simple simulation designs and designs based on real disease architectures. Finally, we demonstrate the ability of our approach to identify biologically plausible SNP-education interactions relative to Alzheimer's disease status using genome-wide association study data from the Alzheimer's Disease Neuroimaging Initiative (ADNI).
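
As a rough illustration of the hierarchical FDR idea mentioned in the abstract, the following Python sketch tests a single omnibus (parent) hypothesis first and applies the Benjamini-Hochberg step-up procedure to the individual interaction p-values only if the omnibus test is rejected. This is a minimal sketch under stated assumptions, not the authors' implementation: the two-level structure, the function names `benjamini_hochberg` and `hierarchical_fdr`, and the use of the same level `alpha` at both levels are all assumptions made for illustration, and the procedure in the paper may differ.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure.

    Returns a list of booleans, True where the hypothesis is rejected.
    """
    m = len(p_values)
    if m == 0:
        return []
    # Indices of the p-values in ascending order.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= (k / m) * alpha.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            k_max = rank
    # Reject the k_max hypotheses with the smallest p-values.
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


def hierarchical_fdr(omnibus_p, interaction_p_values, alpha=0.05):
    """Hypothetical two-level hierarchy (an assumption, not the paper's exact method).

    The individual interaction tests are examined only when the
    omnibus test is rejected at level alpha.
    """
    if omnibus_p > alpha:
        # Omnibus test not rejected: no individual interaction is tested.
        return [False] * len(interaction_p_values)
    return benjamini_hochberg(interaction_p_values, alpha)


if __name__ == "__main__":
    # Made-up p-values, for illustration only.
    print(hierarchical_fdr(0.01, [0.001, 0.02, 0.03, 0.5]))
    print(hierarchical_fdr(0.20, [0.001, 0.02, 0.03, 0.5]))
```

Gating the individual tests on the omnibus result means that the multiple-testing burden of the second level is only paid when there is evidence that any interaction is present at all.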

KEYWORDS: gene-environment interactions; hierarchical FDR; penalized regression; screening testing

PMID: 27578615
PMCID: PMC5108431
DOI: 10.1002/gepi.21997
[Indexed for MEDLINE]
Free PMC Article
