From sets to graphs: towards a realistic enrichment analysis of transcriptomic systems

Bioinformatics. 2011 Jul 1;27(13):i366-73. doi: 10.1093/bioinformatics/btr228.

Abstract

Motivation: Current gene set enrichment approaches do not take interactions and associations between set members into account. Mutual activation and inhibition causing positive and negative correlation among set members are thus neglected. As a consequence, inconsistent regulations and contextless expression changes are reported and, thus, the biological interpretation of the result is impeded.

Results: We analyzed established gene set enrichment methods and their result sets in a large-scale investigation of 1000 expression datasets. The reported statistically significant gene sets exhibit only average consistency between the observed patterns of differential expression and known regulatory interactions. We present Gene Graph Enrichment Analysis (GGEA) to detect consistently and coherently enriched gene sets, based on prior knowledge derived from directed gene regulatory networks. Firstly, GGEA improves the concordance of pairwise regulation with individual expression changes in respective pairs of regulating and regulated genes, compared with set enrichment methods. Secondly, GGEA yields result sets where a large fraction of relevant expression changes can be explained by nearby regulators, such as transcription factors, again improving on set-based methods. Thirdly, we demonstrate in additional case studies that GGEA can be applied to human regulatory pathways, where it sensitively detects very specific regulation processes, which are altered in tumors of the central nervous system. GGEA significantly increases the detection of gene sets where measured positively or negatively correlated expression patterns coincide with directed inducing or repressing relationships, thus facilitating further interpretation of gene expression data.

Availability: The method and accompanying visualization capabilities have been bundled into an R package and tied to a grahical user interface, the Galaxy workflow environment, that is running as a web server.

Contact: Ludwig.Geistlinger@bio.ifi.lmu.de; Ralf.Zimmer@bio.ifi.lmu.de.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Gene Expression Profiling*
  • Gene Expression Regulation, Neoplastic
  • Gene Regulatory Networks
  • Humans
  • Neoplasms, Nerve Tissue / genetics*
  • Neoplasms, Nerve Tissue / metabolism*
  • Proteins / genetics
  • Signal Transduction
  • Software*

Substances

  • Proteins