Bioinformatics. 2009 Feb 1;25(3):401-5. doi: 10.1093/bioinformatics/btn634. Epub 2008 Dec 10.

Matrix correlations for high-dimensional data: the modified RV-coefficient.

Author information

Biosystems Data Analysis, Swammerdam Institute for Life Sciences, University of Amsterdam, Nieuwe Achtergracht 166, 1018 WV Amsterdam, The Netherlands. a.k.smilde@uva.nl

Abstract

MOTIVATION:

Modern functional genomics generates high-dimensional datasets. It is often convenient to have a single number that comprehensively characterizes the relationship between a pair of such datasets. Matrix correlations are such numbers, and they are appealing because they can be interpreted in the same way as the Pearson correlations familiar to biologists. The high dimensionality of functional genomics data is, however, problematic for existing matrix correlations. The motivation of this article is twofold: (i) we introduce the idea of matrix correlations to the bioinformatics community and (ii) we propose an improvement to the most promising matrix correlation coefficient (the RV-coefficient) that circumvents the problems of high-dimensional data.

RESULTS:

The modified RV-coefficient can be used in high-dimensional data analysis studies as an easy measure of the information shared by two datasets. This is shown by theoretical arguments, simulations and applications to two real-life examples from functional genomics: a transcriptomics example and a metabolomics example.
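
The abstract does not state the formulas, so the following is only a minimal sketch under stated assumptions, not the authors' Matlab implementation. It assumes the standard RV-coefficient, tr(XX'YY') / sqrt(tr((XX')^2) tr((YY')^2)) for column-centered matrices X and Y that share the same samples as rows, and it assumes that the modification consists of setting the diagonals of XX' and YY' to zero before forming the same ratio. All function names are hypothetical.

```python
def center_columns(X):
    """Subtract each column mean; rows are samples, columns are variables."""
    n = len(X)
    means = [sum(col) / n for col in zip(*X)]
    return [[v - m for v, m in zip(row, means)] for row in X]


def gram(X):
    """Return the n x n matrix XX' of inner products between samples."""
    return [[sum(a * b for a, b in zip(r1, r2)) for r2 in X] for r1 in X]


def zero_diagonal(S):
    """Copy S with its diagonal set to zero (assumed form of the modification)."""
    return [[0.0 if i == j else v for j, v in enumerate(row)]
            for i, row in enumerate(S)]


def frobenius_inner(A, B):
    """Sum of elementwise products; equals tr(AB) for symmetric A and B."""
    return sum(a * b for ra, rb in zip(A, B) for a, b in zip(ra, rb))


def rv_coefficient(X, Y, modified=False):
    """Matrix correlation between X and Y, which must have the same number of rows."""
    Sx = gram(center_columns(X))
    Sy = gram(center_columns(Y))
    if modified:
        Sx, Sy = zero_diagonal(Sx), zero_diagonal(Sy)
    denom = (frobenius_inner(Sx, Sx) * frobenius_inner(Sy, Sy)) ** 0.5
    return frobenius_inner(Sx, Sy) / denom


# Illustrative data only: 4 samples measured on two different variable sets.
X = [[1.0, 2.0, 0.5], [3.0, 4.0, 1.5], [5.0, 7.0, 2.0], [2.0, 1.0, 0.0]]
Y = [[0.2, 1.0], [0.9, 0.1], [1.5, 2.0], [0.3, 0.4]]
print(rv_coefficient(X, Y), rv_coefficient(X, Y, modified=True))
```

Under these assumptions the standard coefficient lies between 0 and 1, while the variant with zeroed diagonals can also take negative values, so the two numbers are not directly interchangeable.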

AVAILABILITY:

The Matlab m-files of the methods presented can be downloaded from http://www.bdagroup.nl.

PMID: 19073588
DOI: 10.1093/bioinformatics/btn634
[Indexed for MEDLINE]
