Improved recovery of cell-cycle gene expression in Saccharomyces cerevisiae from regulatory interactions in multiple omics data

BMC Genomics. 2020 Feb 13;21(1):159. doi: 10.1186/s12864-020-6554-8.

Abstract

Background: Gene expression is regulated by DNA-binding transcription factors (TFs). Together with their target genes, these factors and their interactions collectively form a gene regulatory network (GRN), which is responsible for producing patterns of transcription, including cyclical processes such as genome replication and cell division. However, identifying how this network regulates the timing of these patterns, including important interactions and regulatory motifs, remains a challenging task.

Results: We employed four in vivo and in vitro regulatory data sets to investigate the regulatory basis of expression timing and phase-specific patterns cell-cycle expression in Saccharomyces cerevisiae. Specifically, we considered interactions based on direct binding between TF and target gene, indirect effects of TF deletion on gene expression, and computational inference. We found that the source of regulatory information significantly impacts the accuracy and completeness of recovering known cell-cycle expressed genes. The best approach involved combining TF-target and TF-TF interactions features from multiple datasets in a single model. In addition, TFs important to multiple phases of cell-cycle expression also have the greatest impact on individual phases. Important TFs regulating a cell-cycle phase also tend to form modules in the GRN, including two sub-modules composed entirely of unannotated cell-cycle regulators (STE12-TEC1 and RAP1-HAP1-MSN4).

Conclusion: Our findings illustrate the importance of integrating both multiple omics data and regulatory motifs in order to understand the significance regulatory interactions involved in timing gene expression. This integrated approached allowed us to recover both known cell-cycles interactions and the overall pattern of phase-specific expression across the cell-cycle better than any single data set. Likewise, by looking at regulatory motifs in the form of TF-TF interactions, we identified sets of TFs whose co-regulation of target genes was important for cell-cycle expression, even when regulation by individual TFs was not. Overall, this demonstrates the power of integrating multiple data sets and models of interaction in order to understand the regulatory basis of established biological processes and their associated gene regulatory networks.

Keywords: Computational biology; Gene expression; Gene regulation; Machine learning; Modeling.

MeSH terms

  • Computational Biology / methods
  • Gene Expression Regulation, Fungal*
  • Gene Regulatory Networks*
  • Genes, cdc*
  • Genomics* / methods
  • Machine Learning
  • Protein Binding
  • Protein Interaction Mapping
  • Protein Interaction Maps
  • Saccharomyces cerevisiae / genetics*
  • Saccharomyces cerevisiae / metabolism
  • Saccharomyces cerevisiae Proteins / genetics
  • Saccharomyces cerevisiae Proteins / metabolism
  • Transcription Factors / metabolism

Substances

  • Saccharomyces cerevisiae Proteins
  • Transcription Factors