Cluster-Rasch models for microarray gene expression data

Genome Biol. 2001;2(8):RESEARCH0031. doi: 10.1186/gb-2001-2-8-research0031. Epub 2001 Jul 31.

Abstract

Background: We propose two formulations of the Rasch statistical model for the problem of relating gene expression profiles to phenotypes. One formulation allows us to investigate whether a cluster of genes with similar expression profiles is related to the observed phenotypes; this model can also be used for prediction. The other formulation provides an alternative way of identifying genes that are over- or underexpressed, based on their expression levels in samples of a given tissue or cell type.

Results: We illustrate the methods on two available datasets: one on the classification of acute leukemias and one on 60 cancer cell lines. For tumor classification, the results are comparable to those previously obtained. For the cancer cell lines dataset, we found four clusters of genes that are related to drug response for many of the 90 drugs that we considered. In addition, for each type of cell line, we identified genes that are over- or underexpressed relative to other genes.

Conclusions: The cluster-Rasch model provides a probabilistic model for describing gene expression patterns across samples and can be used to relate gene expression profiles to phenotypes.
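
For readers unfamiliar with Rasch models, the following is a minimal sketch of the textbook dichotomous Rasch model, not the cluster-Rasch formulation itself, which the abstract does not specify. It assumes binary (for example, thresholded) expression indicators, and the roles given to `theta` (a sample-level parameter) and `beta` (a gene-level parameter) are illustrative assumptions; the function names are hypothetical.

```python
import math


def rasch_probability(theta: float, beta: float) -> float:
    """Textbook Rasch model: P(X = 1) = exp(theta - beta) / (1 + exp(theta - beta)).

    theta and beta are assumed here to be sample-level and gene-level
    parameters; this mapping is an illustration, not taken from the paper.
    """
    return 1.0 / (1.0 + math.exp(-(theta - beta)))


def rasch_log_likelihood(x, thetas, betas):
    """Log-likelihood of a binary matrix x (rows: samples, columns: genes)."""
    total = 0.0
    for i, row in enumerate(x):
        for j, x_ij in enumerate(row):
            p = rasch_probability(thetas[i], betas[j])
            # Bernoulli contribution of a single binary observation.
            total += math.log(p) if x_ij == 1 else math.log(1.0 - p)
    return total


# Hypothetical toy data: 2 samples, 3 genes.
x = [[1, 0, 1],
     [0, 0, 1]]
thetas = [0.5, -0.5]
betas = [0.0, 1.0, -1.0]
print(rasch_log_likelihood(x, thetas, betas))
```

In this form the probability depends only on the difference `theta - beta`, so it equals 0.5 when the two parameters coincide and increases as `theta` grows relative to `beta`.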

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Acute Disease
  • Antigens, Neoplasm / genetics
  • Breast Neoplasms / genetics
  • Cluster Analysis
  • Databases, Nucleic Acid
  • Gene Expression Profiling / methods*
  • Gene Expression Regulation / drug effects
  • Humans
  • Leukemia / genetics*
  • Leukemia, Myeloid / genetics
  • Models, Statistical*
  • Oligonucleotide Array Sequence Analysis / methods*
  • Phenotype
  • Precursor Cell Lymphoblastic Leukemia-Lymphoma / genetics
  • Probability
  • Regression Analysis
  • Tumor Cells, Cultured

Substances

  • Antigens, Neoplasm