Biometrics. 2010 Sep;66(3):793-804. doi: 10.1111/j.1541-0420.2009.01341.x.

Pairwise variable selection for high-dimensional model-based clustering.

Author information

Department of Statistics, University of Michigan, Ann Arbor, Michigan 48109, USA.

Abstract

Variable selection for clustering is an important and challenging problem in high-dimensional data analysis. Existing variable selection methods for model-based clustering select informative variables in a "one-in-all-out" manner; that is, a variable is selected if at least one pair of clusters is separable by this variable and removed if it cannot separate any of the clusters. In many applications, however, it is of interest to further establish exactly which clusters are separable by each informative variable. To address this question, we propose a pairwise variable selection method for high-dimensional model-based clustering. The method is based on a new pairwise penalty. Results on simulated and real data show that the new method performs better than alternative approaches that use ℓ₁ and ℓ∞ penalties and offers better interpretation.
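
The abstract does not state the form of the penalties, so the following is only an illustrative sketch of the contrast it draws. It assumes centered data, so that a variable is uninformative when all of its cluster means are zero, and it assumes the pairwise penalty sums the absolute differences between cluster means for each variable. The function names and the example means are hypothetical and are not taken from the paper.

```python
from itertools import combinations

def l1_penalty(means):
    """Sum of absolute cluster means over all clusters and variables."""
    return sum(abs(m) for row in means for m in row)

def linf_penalty(means):
    """Sum over variables of the largest absolute cluster mean."""
    n_vars = len(means[0])
    return sum(max(abs(row[j]) for row in means) for j in range(n_vars))

def pairwise_penalty(means):
    """Sum over variables and cluster pairs of |mu_kj - mu_k'j| (assumed form)."""
    n_vars = len(means[0])
    return sum(
        abs(a[j] - b[j])
        for a, b in combinations(means, 2)
        for j in range(n_vars)
    )

def separable_pairs(means, tol=1e-8):
    """For each variable, list the cluster pairs whose means differ."""
    n_vars = len(means[0])
    pairs = list(combinations(range(len(means)), 2))
    return {
        j: [(k, l) for k, l in pairs if abs(means[k][j] - means[l][j]) > tol]
        for j in range(n_vars)
    }

# Hypothetical fitted means: 3 clusters (rows) by 3 variables (columns).
means = [
    [0.0, 1.5, 0.0],
    [0.0, 1.5, 0.0],
    [0.0, -2.0, 0.0],
]
print(separable_pairs(means))
```

In this toy example a "one-in-all-out" rule would only report that the second variable is informative. The pairwise view additionally shows that it separates the third cluster from the other two but does not separate the first two from each other.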

PMID: 19912170
PMCID: PMC2888949
DOI: 10.1111/j.1541-0420.2009.01341.x
[Indexed for MEDLINE]
Free PMC Article