Genome-wide conserved consensus transcription factor binding motifs are hyper-methylated

BMC Genomics. 2010 Sep 27:11:519. doi: 10.1186/1471-2164-11-519.

Abstract

Background: DNA methylation can regulate gene expression by modulating the interaction between DNA and proteins or protein complexes. Conserved consensus motifs exist across the human genome ("predicted transcription factor binding sites", or "predicted TFBS"), but chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) shows that the large majority of these are not biological transcription factor binding sites ("empirical TFBS"). We hypothesize that DNA methylation at conserved consensus motifs prevents promiscuous or disorderly transcription factor binding.

Results: Using genome-wide methylation maps of the human heart and sperm, we found that all conserved consensus motifs, as well as the subset of those that reside outside CpG islands, have an aggregate profile of hyper-methylation. In contrast, empirical TFBS with conserved consensus motifs have a profile of hypo-methylation. Of the empirical TFBS with conserved consensus motifs, 40% resided in CpG islands, whereas only 7% of all conserved consensus motifs were in CpG islands. Finally, we identified a minority subset of TF whose profiles are either hypo-methylated or neutral at their respective conserved consensus motifs, implying that these TF may be responsible for establishing or maintaining an un-methylated DNA state, or that their binding is not regulated by DNA methylation.
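
As an illustration of the CpG island comparison above, the fraction of sites falling inside CpG islands can be computed with a simple interval lookup. This is a minimal sketch, not the authors' pipeline: the function name `fraction_in_islands`, the half-open coordinate convention, the assumption that islands do not overlap, and the toy coordinates are all assumptions made for the example.

```python
from bisect import bisect_right

def fraction_in_islands(sites, islands):
    """Return the fraction of sites that fall inside any island.

    sites:   list of (chrom, pos) tuples (hypothetical motif positions)
    islands: list of (chrom, start, end) tuples, half-open [start, end),
             assumed not to overlap one another
    """
    # Group island intervals by chromosome and sort them by start.
    by_chrom = {}
    for chrom, start, end in islands:
        by_chrom.setdefault(chrom, []).append((start, end))
    for intervals in by_chrom.values():
        intervals.sort()
    starts = {c: [s for s, _ in ivs] for c, ivs in by_chrom.items()}

    hits = 0
    for chrom, pos in sites:
        intervals = by_chrom.get(chrom)
        if not intervals:
            continue
        # Find the last island whose start is <= pos, then check its end.
        i = bisect_right(starts[chrom], pos) - 1
        if i >= 0 and pos < intervals[i][1]:
            hits += 1
    return hits / len(sites) if sites else 0.0

# Toy data only; these coordinates are invented for the example.
toy_islands = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 50, 80)]
toy_sites = [("chr1", 150), ("chr1", 200), ("chr1", 550), ("chr2", 10), ("chr3", 5)]
toy_fraction = fraction_in_islands(toy_sites, toy_islands)
```

Running the same function once on the empirical TFBS set and once on the full set of conserved consensus motifs would give the two proportions being compared.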

Conclusions: Our analysis supports the hypothesis that, at least for a subset of TF, empirical binding to conserved consensus motifs genome-wide may be controlled by DNA methylation.

MeSH terms

  • Base Sequence
  • Binding Sites
  • Consensus Sequence / genetics*
  • CpG Islands / genetics
  • DNA Methylation / genetics*
  • Databases, Genetic
  • Genome, Human / genetics*
  • Humans
  • Male
  • Myocardium / metabolism
  • Protein Binding
  • Spermatozoa / metabolism
  • Transcription Factors / metabolism*

Substances

  • Transcription Factors