Display Settings:


Send to:

Choose Destination
See comment in PubMed Commons below
Genome Res. 2002 Oct;12(10):1574-81.

Judging the quality of gene expression-based clustering methods using gene annotation.

Author information

  • 1Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, Boston, Massachusetts 02115, USA.


We compare several commonly used expression-based gene clustering algorithms using a figure of merit based on the mutual information between cluster membership and known gene attributes. By studying various publicly available expression data sets we conclude that enrichment of clusters for biological function is, in general, highest at rather low cluster numbers. As a measure of dissimilarity between the expression patterns of two genes, no method outperforms Euclidean distance for ratio-based measurements, or Pearson distance for non-ratio-based measurements at the optimal choice of cluster number. We show the self-organized-map approach to be best for both measurement types at higher numbers of clusters. Clusters of genes derived from single- and average-linkage hierarchical clustering tend to produce worse-than-random results.

[PubMed - indexed for MEDLINE]
Free PMC Article

Images from this publication.See all images (5)Free text

Figure 1
Figure 2
Figure 3
Figure 4
Figure 5
PubMed Commons home

PubMed Commons

How to join PubMed Commons

    Supplemental Content

    Full text links

    Icon for HighWire Icon for PubMed Central
    Loading ...
    Write to the Help Desk