J Epidemiol Community Health. 2018 Jul;72(7):564-571. doi: 10.1136/jech-2017-210061. Epub 2018 Mar 21.

A multivariate approach to investigate the combined biological effects of multiple exposures.

Author information

Department of Epidemiology and Biostatistics, School of Public Health, MRC-PHE Centre for Environment and Health, Imperial College London, London, UK.
Molecular and Genetic Epidemiology Unit, Italian Institute for Genomic Medicine (IIGM), Turin, Italy.
UMR CNRS 5142, Laboratoire de Mathématiques et de leurs Applications, Université de Pau et des Pays de l'Adour, Anglet, France.
School of Mathematics, ARC Centre of Excellence for Mathematical and Statistical Frontiers, Queensland University of Technology, Brisbane, Australia.
Institute for Risk Assessment Sciences, Division of Environmental Epidemiology, Utrecht University, Utrecht, Netherlands.
ISGlobal, Centre for Research in Environmental Epidemiology (CREAL), Barcelona, Spain.
Universitat Pompeu Fabra (UPF), Barcelona, Spain.
CIBER Epidemiología y Salud Pública (CIBERESP), Barcelona, Spain.
IMIM (Hospital del Mar Medical Research Institute), Barcelona, Spain.
Division of Computational and Systems Medicine, Department of Surgery and Cancer, Faculty of Medicine, Imperial College London, London, UK.


Epidemiological studies provide evidence that environmental exposures may affect health through complex mixtures. Formal investigation of the effect of exposure mixtures is usually achieved by modelling interactions, which relies on strong assumptions regarding the identity and number of the exposures involved in such interactions, and the order and parametric form of these interactions. These hypotheses become difficult to formulate and justify in an exposome context, where influential exposures are numerous and heterogeneous. To capture both the complexity of the exposome and its possibly pleiotropic effects, models handling multivariate predictors and responses, such as partial least squares (PLS) algorithms, can prove useful. As an illustrative example, we applied PLS models to data from a study investigating the inflammatory response (blood concentration of 13 immune markers) to exposure to four disinfection by-products (one brominated and three chlorinated compounds) while swimming in a pool. To accommodate the multiple observations per participant (n=60; before and after the swim), we adopted a multilevel extension of PLS algorithms, including sparse PLS models that shrink the loading coefficients of unimportant predictors (exposures) and/or responses (protein levels). Despite the strong correlation among co-occurring exposures, our approach identified a subset of exposures (n=3/4) affecting the blood levels of 8 (out of 13) immune markers. PLS algorithms can easily scale to high-dimensional exposures and responses, and prove useful for exposome research to identify sparse sets of exposures jointly affecting a set of (selected) biological markers. Our descriptive work may guide such extensions to higher-dimensional data.
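
The multilevel sparse PLS idea summarised above can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes NumPy, runs on simulated data, and the helper names (`within_subject`, `soft_threshold`, `sparse_pls_component`) as well as the penalty parameterisation (a fraction of the largest absolute loading) are hypothetical choices made for illustration. The sketch first removes each participant's mean so that only within-subject (before versus after swim) variation remains, then extracts one pair of sparse loading vectors by alternating soft-thresholded updates on the exposure-by-marker cross-product matrix.

```python
import numpy as np


def within_subject(data, subject_ids):
    """Multilevel step: subtract each participant's own mean from their rows."""
    data = np.asarray(data, dtype=float)
    subject_ids = np.asarray(subject_ids)
    out = np.empty_like(data)
    for s in np.unique(subject_ids):
        rows = subject_ids == s
        out[rows] = data[rows] - data[rows].mean(axis=0)
    return out


def soft_threshold(z, thr):
    """Shrink entries towards zero; entries with |z| <= thr become exactly zero."""
    return np.sign(z) * np.maximum(np.abs(z) - thr, 0.0)


def sparse_pls_component(X, Y, lam_x=0.0, lam_y=0.0, n_iter=100, tol=1e-8):
    """First pair of sparse PLS loading vectors (u for exposures, v for responses).

    lam_x and lam_y must lie in [0, 1): each is the fraction of the largest
    absolute loading used as the soft-threshold (an illustrative convention).
    With both set to 0 this reduces to the leading singular vectors of X^T Y.
    """
    M = X.T @ Y  # cross-product matrix between exposures and responses
    U, _, Vt = np.linalg.svd(M, full_matrices=False)
    u, v = U[:, 0], Vt[0]
    for _ in range(n_iter):
        z = M @ v
        u_new = soft_threshold(z, lam_x * np.abs(z).max())
        u_new /= np.linalg.norm(u_new)
        z = M.T @ u_new
        v = soft_threshold(z, lam_y * np.abs(z).max())
        v /= np.linalg.norm(v)
        converged = np.linalg.norm(u_new - u) < tol
        u = u_new
        if converged:
            break
    return u, v


if __name__ == "__main__":
    # Simulated stand-in for the study layout: 60 participants, 2 visits each,
    # 4 exposures and 13 immune markers (values are random, not real data).
    rng = np.random.default_rng(0)
    subjects = np.repeat(np.arange(60), 2)
    exposures = rng.normal(size=(120, 4))
    markers = rng.normal(size=(120, 13))
    Xw = within_subject(exposures, subjects)
    Yw = within_subject(markers, subjects)
    u, v = sparse_pls_component(Xw, Yw, lam_x=0.3, lam_y=0.3)
    print("selected exposures:", np.flatnonzero(u))
    print("selected markers:", np.flatnonzero(v))
```

Nonzero entries of `u` and `v` indicate the exposures and markers retained on the component. In practice the penalties would be tuned (for example by cross-validation) and further components extracted after deflation, neither of which is shown here.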


Keywords: OMICs data; exposome; multi-level sparse PLS models; multiple exposures; multivariate response
