Bayesian sparse hidden components analysis for transcription regulation networks

Bioinformatics. 2006 Mar 15;22(6):739-46. doi: 10.1093/bioinformatics/btk017. Epub 2005 Dec 20.

Abstract

Motivation: In systems like Escherichia coli, the abundance of sequence information, gene expression array studies and small-scale experiments allows one to reconstruct the regulatory network and to quantify the effects of transcription factors on gene expression. However, this goal can only be achieved if all information sources are used in concert.

Results: Our method integrates literature information, DNA sequences and expression arrays. A set of relevant transcription factors is defined on the basis of the literature. Sequence data are used to identify potential target genes, and the results are used to define a prior distribution on the topology of the regulatory network. A Bayesian hidden component model for the expression array data allows us to identify which of the potential binding sites are actually used by the regulatory proteins in the studied cell conditions, the strength of their control, and their activation profile in a series of experiments. We apply our methodology to 35 expression studies in E. coli with convincing results.
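
The structure described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the authors' implementation: the abstract does not give the model equations, so the linear form (expression = strengths x activities + noise), the Gaussian distributions, and every name and default value below (`site_prob`, `prior_use`, `noise_sd`, the gene and factor counts) are hypothetical. Only the 35 experiments come from the text. The sketch covers the generative side only; it does not perform the Bayesian inference.

```python
import numpy as np


def simulate_sparse_component_model(n_genes=50, n_factors=4, n_experiments=35,
                                    site_prob=0.3, prior_use=0.7,
                                    noise_sd=0.1, seed=0):
    """Draw one data set from an assumed sparse hidden component model."""
    rng = np.random.default_rng(seed)

    # candidate[i, j]: sequence analysis flags a potential binding site
    # of factor j for gene i (assumed to be given; simulated here).
    candidate = rng.random((n_genes, n_factors)) < site_prob

    # used[i, j]: the site is actually used in the studied conditions.
    # The assumed prior puts zero mass outside the candidate sites.
    used = candidate & (rng.random((n_genes, n_factors)) < prior_use)

    # Regulation strengths: nonzero only where a site is used.
    strength = np.where(used, rng.normal(0.0, 1.0, (n_genes, n_factors)), 0.0)

    # Hidden activation profile of each factor across the experiments.
    activity = rng.normal(0.0, 1.0, (n_factors, n_experiments))

    # Observed expression: assumed linear mixing plus Gaussian noise.
    noise = rng.normal(0.0, noise_sd, (n_genes, n_experiments))
    expression = strength @ activity + noise
    return candidate, used, strength, activity, expression


candidate, used, strength, activity, expression = simulate_sparse_component_model()
print(expression.shape, int(candidate.sum()), int(used.sum()))
```

In this picture, inference would run in the opposite direction: given `expression` and `candidate`, estimate which entries of `used` are on, the values in `strength`, and the rows of `activity`.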

Availability: www.genetics.ucla.edu/labs/sabatti/software.html

Supplementary information: Supplementary material is available at Bioinformatics online.

Publication types

  • Evaluation Study
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Algorithms
  • Animals
  • Bayes Theorem
  • Computer Simulation
  • Escherichia coli / physiology*
  • Escherichia coli Proteins / metabolism*
  • Gene Expression Profiling / methods*
  • Gene Expression Regulation / physiology*
  • Humans
  • Models, Biological*
  • Oligonucleotide Array Sequence Analysis / methods
  • Signal Transduction / physiology*
  • Transcription Factors / metabolism*

Substances

  • Escherichia coli Proteins
  • Transcription Factors