Science. 2011 Dec 16;334(6062):1518-24. doi: 10.1126/science.1205438.

Detecting novel associations in large data sets.

Author information

  • Department of Computer Science, Massachusetts Institute of Technology, Cambridge, MA 02139, USA. dnreshef@mit.edu

Abstract

Identifying interesting relationships between pairs of variables in large data sets is increasingly important. Here, we present a measure of dependence for two-variable relationships: the maximal information coefficient (MIC). MIC captures a wide range of associations both functional and not, and for functional relationships provides a score that roughly equals the coefficient of determination (R²) of the data relative to the regression function. MIC belongs to a larger class of maximal information-based nonparametric exploration (MINE) statistics for identifying and classifying relationships. We apply MIC and MINE to data sets in global health, gene expression, major-league baseball, and the human gut microbiota and identify known and novel relationships.
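To make the idea of a grid-based dependence score concrete, here is a minimal sketch of a MIC-style statistic. It is not the authors' algorithm: it only tries equal-frequency grids, whereas the published method searches over grid placements. The function names (`equal_frequency_bins`, `grid_mutual_information`, `mic_sketch`) are hypothetical, and the grid-size limit of n^0.6 cells (`alpha=0.6`) is an assumption not stated in the abstract.

```python
import math
from collections import Counter


def equal_frequency_bins(values, k):
    """Assign each value to one of k rank-based bins; tied values share a bin."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    bins = [0] * n
    for rank, i in enumerate(order):
        prev = order[rank - 1] if rank > 0 else None
        if prev is not None and values[i] == values[prev]:
            bins[i] = bins[prev]  # keep equal values together
        else:
            bins[i] = rank * k // n
    return bins


def grid_mutual_information(xs, ys, nx, ny):
    """Mutual information (in nats) of the data binned on an nx-by-ny grid."""
    n = len(xs)
    bx = equal_frequency_bins(xs, nx)
    by = equal_frequency_bins(ys, ny)
    joint = Counter(zip(bx, by))
    px = Counter(bx)
    py = Counter(by)
    mi = 0.0
    for (i, j), count in joint.items():
        mi += (count / n) * math.log(count * n / (px[i] * py[j]))
    return mi


def mic_sketch(xs, ys, alpha=0.6):
    """Largest normalized grid mutual information over grids with few cells.

    Assumption: grids are limited to nx * ny <= n ** alpha cells, and only
    equal-frequency grids are tried (a simplification of the real search).
    """
    n = len(xs)
    max_cells = n ** alpha
    best = 0.0
    nx = 2
    while nx * 2 <= max_cells:
        ny = 2
        while nx * ny <= max_cells:
            mi = grid_mutual_information(xs, ys, nx, ny)
            # Dividing by log(min(nx, ny)) scales the score to [0, 1].
            best = max(best, mi / math.log(min(nx, ny)))
            ny += 1
        nx += 1
    return best
```

Under these assumptions, a noiseless functional relationship such as a line or a parabola scores close to 1, while independent variables score close to 0. The parabola case is the point of interest: a linear correlation coefficient would be near zero there, yet the grid-based score remains high.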

PMID: 22174245 [PubMed - indexed for MEDLINE]
PMCID: PMC3325791 (free PMC article)
