Bioinformatics. 2006 Jul 15;22(14):e108-16.

Analysis of sample set enrichment scores: assaying the enrichment of sets of genes for individual samples in genome-wide expression profiles.

Author information

Institute for Genome Sciences & Policy, Duke University, Durham, NC 27708, USA.



Gene expression profiling experiments in cell lines and animal models characterized by specific genetic or molecular perturbations have yielded sets of genes annotated by the perturbation. These gene sets can serve as a reference base for interrogating other expression datasets. For example, a new dataset in which a specific pathway gene set appears enriched, meaning that multiple genes in the set show expression changes, can then be annotated by that reference pathway. In this paper we introduce a formal statistical method, Analysis of Sample Set Enrichment Scores (ASSESS), to measure the enrichment of each gene set in each sample of an expression dataset. This allows us to assay the natural variation of pathway activity in observed gene expression datasets from clinical cancer and other studies.
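
The idea of scoring a gene set within a single sample can be illustrated with a minimal sketch. The abstract does not specify the ASSESS statistic, so the code below is not the authors' method: it assumes a simplified, unweighted running-sum score over genes ranked by expression within one sample. The function name `sample_enrichment` and the toy gene names are hypothetical.

```python
def sample_enrichment(expression, gene_set):
    """Illustrative per-sample enrichment score (NOT the ASSESS statistic).

    expression: dict mapping gene name -> expression value in ONE sample.
    gene_set:   set of gene names forming the reference pathway.

    Genes are ranked from highest to lowest expression. Walking down the
    ranking, the running sum rises on a gene-set member and falls otherwise.
    The score is the running-sum value with the largest absolute deviation
    from zero, so it lies between -1 and 1.
    """
    ranked = sorted(expression, key=expression.get, reverse=True)
    hits = [gene in gene_set for gene in ranked]
    n_hit = sum(hits)
    n_miss = len(ranked) - n_hit
    if n_hit == 0 or n_miss == 0:
        return 0.0  # score is undefined without both members and non-members

    running = 0.0
    best = 0.0
    for is_hit in hits:
        running += 1.0 / n_hit if is_hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


# Toy data (assumed for illustration): two samples, four genes.
samples = {
    "sample_1": {"A": 4.0, "B": 3.0, "C": 2.0, "D": 1.0},
    "sample_2": {"A": 1.0, "B": 2.0, "C": 3.0, "D": 4.0},
}
pathway = {"A", "B"}
scores = {name: sample_enrichment(expr, pathway) for name, expr in samples.items()}
```

Computing one score per sample and per gene set, as in the `scores` dictionary, yields a samples-by-pathways table that can be compared across samples.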


We validate the method and illustrate the biological insights it yields on cell line data, mouse models, and cancer-related datasets. Using oncogenic pathway signatures, we show that gene sets built from a model system are indeed enriched in the model system. We use ASSESS for molecular classification by pathways, which provides an accurate classifier that can be interpreted at the level of pathways instead of individual genes. Finally, ASSESS can be used for cross-platform expression models, in which data on the same type of cancer from different platforms are integrated into a common space of enrichment scores.


Versions are available in Octave and Java (with a graphical user interface). Software can be downloaded at

[Indexed for MEDLINE]