Front Genet. 2016 Jun 7;7:102. doi: 10.3389/fgene.2016.00102. eCollection 2016.

Random Projection for Fast and Efficient Multivariate Correlation Analysis of High-Dimensional Data: A New Approach.

Author information

1. Department of Neurology, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany; IFB Adiposity Diseases, Leipzig University Medical Center, Leipzig, Germany.
2. Department of Neurology, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany; IFB Adiposity Diseases, Leipzig University Medical Center, Leipzig, Germany; Collaborative Research Center 1052-A5, University of Leipzig, Leipzig, Germany.
3. Department of Neurology, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany; Department of Psychology, Dresden University of Technology, Dresden, Germany.
4. IFB Adiposity Diseases, Leipzig University Medical Center, Leipzig, Germany.
5. Hospital for Endocrinology and Nephrology, University Hospital Leipzig, Leipzig, Germany.
6. Norwegian Centre for Mental Disorders Research (NORMENT), KG Jebsen Centre for Psychosis Research, University Hospital Oslo, Oslo, Norway; Department of Psychology, University of Oslo, Oslo, Norway.
7. Norwegian Centre for Mental Disorders Research (NORMENT), KG Jebsen Centre for Psychosis Research, University Hospital Oslo, Oslo, Norway.
8. IFB Adiposity Diseases, Leipzig University Medical Center, Leipzig, Germany; Hospital for Endocrinology and Nephrology, University Hospital Leipzig, Leipzig, Germany.
9. Department of Neurology, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany; IFB Adiposity Diseases, Leipzig University Medical Center, Leipzig, Germany; Clinic for Cognitive Neurology, University Hospital Leipzig, Leipzig, Germany; Mind and Brain Institute, Berlin School of Mind and Brain, Humboldt-University and Charité, Berlin, Germany.

Abstract

In recent years, technological advances have produced a wealth of very high-dimensional data, and combining high-dimensional information from multiple sources is becoming increasingly important in an expanding range of scientific disciplines. Partial Least Squares Correlation (PLSC) is a frequently used method for multivariate multimodal data integration. It is, however, computationally expensive in applications involving large numbers of variables, as required, for example, in genetic neuroimaging. To handle high-dimensional problems, dimension reduction can be applied as a pre-processing step. We propose a new approach, which we name PLSC-RP, that incorporates Random Projection (RP) for dimensionality reduction into PLSC to efficiently solve high-dimensional multimodal problems such as genotype-phenotype associations. Using simulated and experimental data sets containing whole-genome SNP measures as genotypes and whole-brain neuroimaging measures as phenotypes, we demonstrate that PLSC-RP is drastically faster than traditional PLSC while providing statistically equivalent results. We also provide evidence that dimensionality reduction using RP is independent of data type. PLSC-RP therefore opens up a wide range of possible applications: it can be used for any integrative analysis that combines information from multiple sources.
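
The general idea of combining a random projection with PLSC can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `plsc` and `plsc_rp`, the Gaussian projection matrix, the reduced dimension `k`, and the toy data sizes are all assumptions made for illustration. The sketch computes PLSC as the singular value decomposition of the cross-product matrix of the column-centered data blocks, once on the full data and once after projecting the high-dimensional block onto `k` random directions.

```python
import numpy as np

def plsc(X, Y):
    """Plain PLSC sketch: SVD of the cross-product matrix of centered X and Y."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # Yc.T @ Xc has shape (q, p); its singular values measure shared covariance.
    U, s, Vt = np.linalg.svd(Yc.T @ Xc, full_matrices=False)
    return U, s, Vt.T

def plsc_rp(X, Y, k, seed=0):
    """Hypothetical PLSC with a Gaussian random projection of X's columns to k dims."""
    rng = np.random.default_rng(seed)
    p = X.shape[1]
    # Assumption: dense Gaussian projection scaled by 1/sqrt(k), so that
    # R @ R.T equals the identity in expectation.
    R = rng.standard_normal((p, k)) / np.sqrt(k)
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    Xr = Xc @ R  # reduced block, shape (n, k)
    U, s, Vt = np.linalg.svd(Yc.T @ Xr, full_matrices=False)
    # The X-side singular vectors live in the reduced k-dimensional space.
    return U, s, Vt.T, R

# Toy data (assumed sizes): 50 samples, 5000 "genotype" and 5 "phenotype" variables.
rng = np.random.default_rng(1)
X = rng.standard_normal((50, 5000))
Y = rng.standard_normal((50, 5))

_, s_full, _ = plsc(X, Y)
_, s_rp, _, _ = plsc_rp(X, Y, k=1000)
rel_err = abs(s_rp[0] - s_full[0]) / s_full[0]
print("relative error of leading singular value:", rel_err)
```

In this sketch the SVD is taken of a 5 x 1000 matrix instead of a 5 x 5000 one, and the singular values are only approximately preserved; how closely they match depends on `k` and on the projection used.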

KEYWORDS:

Partial Least Squares Correlation; dimensionality reduction; genetic neuroimaging; genome-wide association; multivariate multimodal data integration
