Nucleic Acids Res. 2015 Apr 30;43(8):3998-4012. doi: 10.1093/nar/gkv195. Epub 2015 Mar 19.

Integrating motif, DNA accessibility and gene expression data to build regulatory maps in an organism.

Author information

  • 1. Department of Computer Science, University of Illinois, Urbana, IL 61801, USA.
  • 2. National Heart Lung and Blood Institute, National Institutes of Health, Bethesda, MD 20892, USA.
  • 3. Program in Gene Function and Expression, University of Massachusetts Medical School, Worcester, MA 01655, USA; Department of Molecular Medicine, University of Massachusetts Medical School, Worcester, MA 01655, USA.
  • 4. Program in Gene Function and Expression, University of Massachusetts Medical School, Worcester, MA 01655, USA; Department of Biochemistry and Molecular Pharmacology, University of Massachusetts Medical School, Worcester, MA 01655, USA.
  • 5. Department of Computer Science, University of Illinois, Urbana, IL 61801, USA; Institute of Genomic Biology, University of Illinois, Urbana, IL 61801, USA. sinhas@illinois.edu

Abstract

Characterization of cell-type-specific regulatory networks and elements is a major challenge in genomics, and emerging strategies frequently employ high-throughput genome-wide assays of transcription factor (TF)-DNA binding, histone modifications or chromatin state. However, these experiments remain too difficult or expensive for many laboratories to apply comprehensively to their system of interest. Here, we explore the potential of elucidating regulatory systems in varied cell types using computational techniques that rely only on gene expression data, low-resolution chromatin accessibility data, and TF-DNA binding specificities ('motifs'). We show that static computational motif scans overlaid with chromatin accessibility data reasonably approximate experimentally measured TF-DNA binding. We demonstrate that predicted binding profiles and expression patterns of hundreds of TFs are sufficient to identify major regulators of ∼200 spatiotemporal expression domains in the Drosophila embryo. We are then able to learn reliable statistical models of enhancer activity for over 70 expression domains and apply those models to annotate domain-specific enhancers genome-wide. Throughout this work, we apply our motif- and accessibility-based approach to comprehensively characterize the regulatory network of fruitfly embryonic development and show that the accuracy of our computational method compares favorably to approaches that rely on data from many experimental assays.
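
The idea of overlaying a static motif scan with chromatin accessibility can be illustrated with a minimal sketch. This is not the authors' pipeline: the scoring scheme (log-odds against a uniform background), the pseudocount, the threshold, the half-open interval convention, and all function names are assumptions chosen for illustration, and the input sequence is assumed to contain only uppercase A, C, G and T.

```python
import math

# Assumption: uniform background base frequencies.
BACKGROUND = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}


def log_odds_matrix(pfm, pseudocount=0.01):
    """Turn per-position base frequencies into log2-odds scores."""
    matrix = []
    for column in pfm:
        total = sum(column.values()) + 4 * pseudocount
        matrix.append({
            base: math.log2(
                ((column.get(base, 0.0) + pseudocount) / total) / BACKGROUND[base]
            )
            for base in "ACGT"
        })
    return matrix


def scan(sequence, matrix):
    """Score every window of motif width; return (start, score) pairs."""
    width = len(matrix)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(matrix[i][base] for i, base in enumerate(window))
        hits.append((start, score))
    return hits


def is_accessible(start, width, open_regions):
    """True if the site lies entirely inside one half-open [lo, hi) region."""
    return any(lo <= start and start + width <= hi for lo, hi in open_regions)


def predict_sites(sequence, pfm, open_regions, threshold):
    """Keep motif matches above threshold that fall in accessible chromatin."""
    matrix = log_odds_matrix(pfm)
    width = len(matrix)
    return [
        (start, score)
        for start, score in scan(sequence, matrix)
        if score >= threshold and is_accessible(start, width, open_regions)
    ]
```

In this sketch a motif match is reported only when it also falls inside an accessible region, so identical sequence matches in closed chromatin are discarded. A real analysis would also scan the reverse strand and would likely weight matches by accessibility signal rather than apply a hard filter.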

PMID: 25791631
PMCID: PMC4417154
DOI: 10.1093/nar/gkv195
[PubMed - indexed for MEDLINE]
Free PMC Article