Epidemiology. 2016 Nov;27(6):870-8. doi: 10.1097/EDE.0000000000000548.

Classification and Clustering Methods for Multiple Environmental Factors in Gene-Environment Interaction: Application to the Multi-Ethnic Study of Atherosclerosis.

Author information

From the (a) Department of Biostatistics and Bioinformatics, Emory University, Atlanta, GA; (b) Department of Biostatistics, University of Michigan, Ann Arbor, MI; (c) Department of Epidemiology, University of Michigan, Ann Arbor, MI; (d) Department of Family Medicine and Public Health, University of California San Diego, La Jolla, CA; and (e) Department of Epidemiology and Biostatistics, Dornsife School of Public Health at Drexel University, Philadelphia, PA.


Interest has grown in identifying gene-environment interaction (G × E) in the context of multiple environmental exposures. Most G × E studies analyze one exposure at a time, although in reality people experience many exposures simultaneously. Efficient analysis strategies for complex G × E with multiple environmental factors in a single model are still lacking. Using data from the Multi-Ethnic Study of Atherosclerosis, we illustrate a two-step approach for modeling G × E with multiple environmental factors. First, we use common clustering and classification strategies (e.g., k-means, latent class analysis, classification and regression trees, Bayesian clustering using a Dirichlet process) to define subgroups corresponding to distinct environmental exposure profiles. Second, we illustrate the use of an additive main effects and multiplicative interaction model, instead of the conventional saturated interaction model using product terms of factors, to study G × E with the data-driven exposure subgroups defined in the first step. The analytical approaches we demonstrate translate multiple environmental exposures into one summary class. These tools not only allow researchers to consider several environmental exposures in G × E analysis but also provide some insight into how genes modify the effect of a comprehensive exposure profile, rather than examining effect modification for each exposure in isolation.
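
As a rough illustration of the two-step approach described in the abstract (a sketch under stated assumptions, not the authors' implementation), the Python below first groups exposure profiles with k-means, one of the several clustering options the abstract lists, and then fits an additive main effects and multiplicative interaction model with a single multiplicative term to a genotype-by-class table of mean outcomes. The function names `kmeans` and `ammi1`, the farthest-point initialisation, the restriction to one multiplicative term, and all data values are assumptions made for illustration only.

```python
def kmeans(points, k, iters=100):
    """Lloyd's algorithm with a deterministic farthest-point start (an assumption)."""
    def d2(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    centers = [list(points[0])]
    while len(centers) < k:
        # Next centre: the point farthest from every centre chosen so far.
        centers.append(list(max(points, key=lambda p: min(d2(p, c) for c in centers))))
    labels = None
    for _ in range(iters):
        new = [min(range(k), key=lambda c: d2(p, centers[c])) for p in points]
        if new == labels:          # assignments stopped changing
            break
        labels = new
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def ammi1(table, iters=500):
    """Additive main effects plus one multiplicative interaction term.

    table[i][j] is the mean outcome for genotype i in exposure class j.
    Returns (grand, row, col, lam, u, v) with
    table[i][j] ~ grand + row[i] + col[j] + lam * u[i] * v[j].
    """
    n_r, n_c = len(table), len(table[0])
    grand = sum(map(sum, table)) / (n_r * n_c)
    row = [sum(r) / n_c - grand for r in table]
    col = [sum(r[j] for r in table) / n_r - grand for j in range(n_c)]
    resid = [[table[i][j] - grand - row[i] - col[j] for j in range(n_c)]
             for i in range(n_r)]
    # Leading singular triple of the interaction residuals by power iteration,
    # started from the residual row with the largest norm.
    v = list(max(resid, key=lambda r: sum(x * x for x in r)))
    u, lam = [0.0] * n_r, 0.0
    for _ in range(iters):
        u = [sum(r[j] * v[j] for j in range(n_c)) for r in resid]
        norm_u = sum(x * x for x in u) ** 0.5
        if norm_u == 0.0:          # purely additive table: no interaction term
            return grand, row, col, 0.0, [0.0] * n_r, [0.0] * n_c
        u = [x / norm_u for x in u]
        v = [sum(resid[i][j] * u[i] for i in range(n_r)) for j in range(n_c)]
        lam = sum(x * x for x in v) ** 0.5
        v = [x / lam for x in v]
    return grand, row, col, lam, u, v


# Step 1: hypothetical exposure profiles (two exposures per participant).
profiles = [[0, 0], [0, 1], [1, 0], [1, 1],
            [10, 10], [10, 11], [11, 10], [11, 11]]
classes = kmeans(profiles, k=2)

# Step 2: hypothetical mean outcomes for 3 genotypes (rows) by 4 exposure
# classes (columns); in practice these would be computed from the data
# using the class labels obtained in step 1.
means = [[7.5, 8.5, 11.5, 8.5],
         [8.0, 10.0, 12.0, 10.0],
         [8.5, 11.5, 12.5, 11.5]]
grand, row, col, lam, u, v = ammi1(means)
fitted = [[grand + row[i] + col[j] + lam * u[i] * v[j] for j in range(4)]
          for i in range(3)]
```

In this sketch the interaction is summarised by one scale `lam` and two score vectors (`u` for genotypes, `v` for exposure classes), rather than by a separate product-term coefficient for every genotype-by-class cell as in a saturated model. A fuller analysis would work with individual-level data, retain more than one multiplicative term where warranted, and assess uncertainty, none of which is attempted here.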

[Indexed for MEDLINE]
Free PMC Article
