Stat Methods Med Res. 2018 May;27(5):1331-1350. doi: 10.1177/0962280216660128. Epub 2016 Jul 26.

Principal component of explained variance: An efficient and optimal data dimension reduction framework for association studies.

Author information

1 Department of Epidemiology, Biostatistics, and Occupational Health, McGill University, Montreal, Quebec, Canada.
2 Lady Davis Institute for Medical Research, Montréal, Quebec, Canada.
3 Ludmer Centre for Neuroinformatics and Mental Health, Montréal, Quebec, Canada.
4 Département de mathématiques, Université du Québec à Montréal, Montreal, Quebec, Canada.
5 Arctic Diagnostics Inc., Toronto, Ontario, Canada.
6 Translational Neuroimaging Laboratory, McGill Center for Studies in Aging, Montreal, Quebec, Canada.
7 Department of Psychiatry, McGill University, Montreal, Quebec, Canada.
8 Department of Neurology and Neurosurgery, McGill University, Montreal, Quebec, Canada.
9 Department of Human Genetics, McGill University, Montreal, Quebec, Canada.
10 Department of Oncology, McGill University, Montreal, Quebec, Canada.
11 Douglas Mental Health University Institute, Montreal, Quebec, Canada.


The genomics era has led to an increase in the dimensionality of data collected in the investigation of biological questions. In this context, dimension-reduction techniques can be used to summarise high-dimensional signals into low-dimensional ones, which can then be tested for association with one or more covariates of interest. This paper revisits one such approach, previously known as principal component of heritability and renamed here as principal component of explained variance (PCEV). As its name suggests, the PCEV seeks the linear combination of outcomes that maximises the proportion of variance explained by one or several covariates of interest. By construction, this method optimises power; however, due to its computational complexity, it has received little attention in the past. Here, we propose a general analytical PCEV framework that builds on the assets of the original method, namely its conceptual simplicity and absence of tuning parameters. Moreover, our framework extends the range of applications of the original procedure by providing a computationally simple strategy for high-dimensional outcomes, along with exact and asymptotic testing procedures that drastically reduce its computational cost. We investigate the merits of the PCEV using an extensive set of simulations. Furthermore, the use of the PCEV approach is illustrated using three examples taken from the fields of epigenetics and brain imaging.
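
The core idea of the abstract, a linear combination of outcomes chosen to maximise the proportion of variance explained, can be illustrated with a minimal sketch. It is not the authors' implementation and it does not cover the high-dimensional strategy or the testing procedures. It assumes a single covariate and more observations than outcomes, in which case the maximising weights are proportional to the inverse residual cross-product matrix applied to the vector of regression slopes. With several covariates the maximiser is instead the leading generalised eigenvector of the model and residual variance matrices. All function names below are hypothetical.

```python
import random


def center(values):
    """Subtract the mean from a list of numbers."""
    m = sum(values) / len(values)
    return [v - m for v in values]


def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for k in range(n):
        piv = max(range(k, n), key=lambda r: abs(M[r][k]))
        M[k], M[piv] = M[piv], M[k]
        for r in range(k + 1, n):
            f = M[r][k] / M[k][k]
            for c in range(k, n + 1):
                M[r][c] -= f * M[k][c]
    z = [0.0] * n
    for k in reversed(range(n)):
        s = sum(M[k][c] * z[c] for c in range(k + 1, n))
        z[k] = (M[k][n] - s) / M[k][k]
    return z


def pcev_single_covariate(Y, x):
    """Weights w maximising the variance of Y w explained by x.

    Y is a list of n rows with p outcomes each, x a list of n covariate
    values. Assumption: one covariate and n > p + 1, so that the residual
    cross-product matrix is invertible.
    """
    n, p = len(Y), len(Y[0])
    xc = center(x)
    cols = [center([row[j] for row in Y]) for j in range(p)]
    sxx = sum(v * v for v in xc)
    # Least-squares slope of each outcome on the covariate.
    b = [sum(xi * yi for xi, yi in zip(xc, col)) / sxx for col in cols]
    # Residuals of each outcome and their cross-product matrix V_R.
    resid = [[col[i] - b[j] * xc[i] for i in range(n)]
             for j, col in enumerate(cols)]
    V_R = [[sum(u * v for u, v in zip(resid[j], resid[k]))
            for k in range(p)] for j in range(p)]
    # With one covariate the model matrix has rank one, so the optimum
    # has the closed form w proportional to V_R^{-1} b.
    return solve(V_R, b)


def explained_fraction(Y, x, w):
    """Proportion of the variance of the component Y w explained by x."""
    z = center([sum(wj * yj for wj, yj in zip(w, row)) for row in Y])
    xc = center(x)
    sxz = sum(a * c for a, c in zip(xc, z))
    return sxz ** 2 / (sum(a * a for a in xc) * sum(c * c for c in z))


# Simulated data: two outcomes depend weakly on x, the third is pure noise.
rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(60)]
Y = [[0.5 * xi + rng.gauss(0, 1), -0.3 * xi + rng.gauss(0, 1), rng.gauss(0, 1)]
     for xi in x]
w = pcev_single_covariate(Y, x)
print(explained_fraction(Y, x, w))
```

In this toy setting the printed proportion is at least as large as the one obtained from any single outcome or any other fixed weighting, which is the sense in which the component is optimal.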


DNA methylation; Dimension reduction; brain imaging; exact test; multivariate analysis; region-based analysis

