Send to

Choose Destination
PLoS Comput Biol. 2009 Jun;5(6):e1000414. doi: 10.1371/journal.pcbi.1000414. Epub 2009 Jun 19.

Identification of yeast transcriptional regulation networks using multivariate random forests.

Author information

Department of Epidemiology and Biostatistics, Center for Bioinformatics and Molecular Biostatistics, University of California, San Francisco, California, USA.


The recent availability of whole-genome scale data sets that investigate complementary and diverse aspects of transcriptional regulation has spawned an increased need for new and effective computational approaches to analyze and integrate these large scale assays. Here, we propose a novel algorithm, based on random forest methodology, to relate gene expression (as derived from expression microarrays) to sequence features residing in gene promoters (as derived from DNA motif data) and transcription factor binding to gene promoters (as derived from tiling microarrays). We extend the random forest approach to model a multivariate response as represented, for example, by time-course gene expression measures. An analysis of the multivariate random forest output reveals complex regulatory networks, which consist of cohesive, condition-dependent regulatory cliques. Each regulatory clique features homogeneous gene expression profiles and common motifs or synergistic motif groups. We apply our method to several yeast physiological processes: cell cycle, sporulation, and various stress conditions. Our technique displays excellent performance with regard to identifying known regulatory motifs, including high order interactions. In addition, we present evidence of the existence of an alternative MCB-binding pathway, which we confirm using data from two independent cell cycle studies and two other physioloigical processes. Finally, we have uncovered elaborate transcription regulation refinement mechanisms involving PAC and mRRPE motifs that govern essential rRNA processing. These include intriguing instances of differing motif dosages and differing combinatorial motif control that promote regulatory specificity in rRNA metabolism under differing physiological processes.

[Indexed for MEDLINE]
Free PMC Article

Supplemental Content

Full text links

Icon for Public Library of Science Icon for PubMed Central
Loading ...
Support Center