Comput Biol Chem. 2016 Aug;63:62-72. doi: 10.1016/j.compbiolchem.2016.01.014. Epub 2016 Feb 13.

MOCCS: Clarifying DNA-binding motif ambiguity using ChIP-Seq data.

Author information

1. Department of Computational Biology, Graduate School of Frontier Sciences, The University of Tokyo, Kashiwanoha 5-1-5, Kashiwa, 277-8568 Chiba, Japan. Electronic address: haruka.ozaki@riken.jp.
2. Department of Computational Biology, Graduate School of Frontier Sciences, The University of Tokyo, Kashiwanoha 5-1-5, Kashiwa, 277-8568 Chiba, Japan; Department of Biological Sciences, Graduate School of Science, The University of Tokyo, Hongo 7-3-1, Bunkyo-ku, 113-0032 Tokyo, Japan; Atmosphere and Ocean Research Institute, The University of Tokyo, Kashiwanoha 5-1-5, Kashiwa, 277-8564 Chiba, Japan. Electronic address: iwasaki@bs.s.u-tokyo.ac.jp.

Abstract

BACKGROUND:

As a key mechanism of gene regulation, transcription factors (TFs) bind to DNA by recognizing specific short sequence patterns that are called DNA-binding motifs. A single TF can accept ambiguity within its DNA-binding motifs, which comprise both canonical (typical) and non-canonical motifs. Clarification of such DNA-binding motif ambiguity is crucial for revealing gene regulatory networks and evaluating mutations in cis-regulatory elements. Although chromatin immunoprecipitation sequencing (ChIP-Seq) now provides abundant data on the genomic sequences to which a given TF binds, existing motif discovery methods are unable to directly answer whether a given TF can bind to a specific DNA-binding motif.

RESULTS:

Here, we report a method for clarifying the DNA-binding motif ambiguity, MOCCS. Given ChIP-Seq data of any TF, MOCCS comprehensively analyzes and describes every k-mer to which that TF binds. Analysis of simulated datasets revealed that MOCCS is applicable to various ChIP-Seq datasets, requiring only a few minutes per dataset. Application to the ENCODE ChIP-Seq datasets proved that MOCCS directly evaluates whether a given TF binds to each DNA-binding motif, even if known position weight matrix models do not provide sufficient information on DNA-binding motif ambiguity. Furthermore, users are not required to provide numerous parameters or background genomic sequence models that are typically unavailable. MOCCS is implemented in Perl and R and is freely available via https://github.com/yuifu/moccs.
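
The abstract does not describe how MOCCS scores k-mers, so the following is only a minimal Python sketch of the general idea of examining every k-mer in ChIP-Seq-derived sequences, not the MOCCS algorithm itself. It assumes that each input sequence is a fixed-width window centered on a peak position, and it records how often each k-mer occurs and how far, on average, it lies from the window center. The function names `kmer_center_offsets` and `summarize` and the toy sequences are hypothetical and are not taken from the MOCCS code base.

```python
from collections import defaultdict


def kmer_center_offsets(sequences, k):
    """Map each k-mer to the distances of its occurrences from the window center.

    Assumption: every sequence is a window centered on a ChIP-Seq peak position.
    """
    offsets = defaultdict(list)
    for seq in sequences:
        seq = seq.upper()
        center = len(seq) / 2
        for start in range(len(seq) - k + 1):
            kmer = seq[start:start + k]
            if "N" in kmer:
                # Skip k-mers that contain ambiguous bases.
                continue
            kmer_mid = start + k / 2
            offsets[kmer].append(abs(kmer_mid - center))
    return offsets


def summarize(offsets):
    """Return {k-mer: (occurrence count, mean distance from the window center)}."""
    return {kmer: (len(d), sum(d) / len(d)) for kmer, d in offsets.items()}


# Toy windows (invented for illustration) that both carry GATAAG at the center.
windows = ["TTTTGATAAGTTTT", "CCCAGATAAGGCCC"]
summary = summarize(kmer_center_offsets(windows, k=6))

# List k-mers from the most to the least centrally located.
for kmer, (count, mean_dist) in sorted(summary.items(), key=lambda kv: kv[1][1]):
    print(kmer, count, mean_dist)
```

In this toy input, a k-mer that sits at the center of both windows ends up with a mean distance of zero, whereas flanking k-mers get larger values. A real analysis would need a statistical criterion to decide which k-mers are bound, which this sketch does not attempt.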

CONCLUSIONS:

By complementing existing motif-discovery software, MOCCS will contribute to the basic understanding of how the genome controls diverse cellular processes via DNA-protein interactions.

KEYWORDS:

ChIP-Seq; DNA binding motifs; Transcription factors

[Indexed for MEDLINE]
