We have developed a high-throughput, next-generation DNA sequencing assay for rapid transcription factor binding site (TFBS) discovery in a genomic context. DNA affinity purification sequencing (DAP-seq), which uses affinity-purified transcription factors (TFs) to capture genomic DNA fragments, was applied to all 1,725 Arabidopsis thaliana TFs. High-confidence TFBS motifs were identified for 529 TFs and genome-wide enrichment maps for 349 factors. In total, ~2.7 million TFBSs were identified, which predict thousands of TF target genes enriched for known and novel functions. Comparison of TF binding using cytosine-methylated and unmethylated genomic DNA revealed a 2- to 50-fold inhibition at methylated motifs for ~82% (264) of factors tested, while 4.6% (15) showed stronger binding to methylated motifs. Finally, we describe how binding of Arabidopsis and maize Auxin Response Factors (ARFs) at phased motif repeats is highly enriched at ARF target gene promoters and how this architecture may allow for stabilization of dimers/multimers.
Identification of sequence motifs for 530 transcription factors (529 Arabidopsis thaliana, 1 Zea mays) and genome-wide binding sites for 350 transcription factors (349 Arabidopsis thaliana, 1 Zea mays) by direct sequencing of affinity-purified genomic DNA fragments. Identification of binding sites for 343 transcription factors (Arabidopsis thaliana) by direct sequencing of affinity-purified, PCR-amplified DNA libraries. Comparison to ChIP-Seq for ABI5 (AT2G36270). The 530 transcription factors refer to the number of unique proteins assayed on non-amplified DNA libraries (i.e. "characteristics: DNA source" = "col" or "zm"). An additional 5 factors have data only on amplified libraries (i.e. "characteristics: DNA source = colamp"); these libraries do not represent the natural DNA methylation state of the organism, so the 5 factors are not counted (more details in the "description" field). A subset of 350 of the 530 have both motif and TFBS identification (i.e. "characteristics: subset = TFBS+motif"), and the remaining 180 have only motif identification (i.e. "characteristics: subset = motif").
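The counting rule described above can be illustrated with a minimal Python sketch. It assumes the sample metadata has already been parsed into one dictionary per sample; the key names (`tf`, `dna_source`, `subset`), the function name `summarize`, and the toy records are hypothetical and are not taken from the deposited data. Only the field values ("col", "zm", "colamp", "TFBS+motif", "motif") come from the description.

```python
from collections import defaultdict

# DNA sources that represent non-amplified libraries, per the description.
NON_AMPLIFIED = {"col", "zm"}


def summarize(samples):
    """Count unique factors following the rule in the description.

    `samples` is an iterable of dicts with hypothetical keys:
    'tf', 'dna_source', and (for non-amplified libraries) 'subset'.
    """
    subsets_by_tf = defaultdict(set)  # factors assayed on non-amplified DNA
    amplified_tfs = set()             # factors assayed on amplified DNA

    for s in samples:
        if s["dna_source"] in NON_AMPLIFIED:
            subsets_by_tf[s["tf"]].add(s["subset"])
        elif s["dna_source"] == "colamp":
            amplified_tfs.add(s["tf"])

    counted = set(subsets_by_tf)
    with_tfbs = {tf for tf, subs in subsets_by_tf.items() if "TFBS+motif" in subs}

    return {
        "counted": len(counted),                          # 530 in the dataset
        "tfbs_and_motif": len(with_tfbs),                 # 350 in the dataset
        "motif_only": len(counted - with_tfbs),           # 180 in the dataset
        "amplified_only": len(amplified_tfs - counted),   # 5 in the dataset
    }


# Toy records for illustration only; these are not real samples.
samples = [
    {"tf": "TF_A", "dna_source": "col", "subset": "TFBS+motif"},
    {"tf": "TF_A", "dna_source": "colamp"},
    {"tf": "TF_B", "dna_source": "col", "subset": "motif"},
    {"tf": "TF_C", "dna_source": "zm", "subset": "TFBS+motif"},
    {"tf": "TF_D", "dna_source": "colamp"},
]

result = summarize(samples)
```

In this toy input, TF_D appears only on an amplified library and is therefore excluded from the counted total, mirroring how the 5 amplified-only factors are left out of the 530.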