Bioinformatics. 2008 Sep 1;24(17):1850-7. doi: 10.1093/bioinformatics/btn331. Epub 2008 Jun 27.

Context-dependent DNA recognition code for C2H2 zinc-finger transcription factors.

Author information

Department of Genetics, Washington University School of Medicine, 660 S Euclid, Box 8232, St. Louis, MO 63110, USA.



Modeling and identifying the DNA-protein recognition code is one of the most challenging problems in computational biology. Several quantitative methods have been developed to model DNA-protein interactions, with a specific focus on the C2H2 zinc-finger proteins, the largest transcription factor family in eukaryotic genomes. In many cases they performed well, but their overall predictive accuracy is still limited. One major reason is that all these methods use weight matrix models to represent DNA-protein interactions, which assume that all base-amino acid contacts contribute independently to the total free energy of binding.
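The independence assumption behind a weight matrix model can be sketched in a few lines of Python. This is only an illustration: the matrix values, the three-position site, and the name `additive_energy` are hypothetical and are not taken from the paper or its software.

```python
# Hypothetical additive weight matrix: one energy term per (position, base).
# The numbers are made up for illustration, in arbitrary energy units.
WEIGHTS = [
    {"A": 0.0, "C": 1.2, "G": 0.4, "T": 2.0},  # position 1
    {"A": 1.5, "C": 0.0, "G": 0.9, "T": 1.1},  # position 2
    {"A": 0.7, "C": 2.1, "G": 0.0, "T": 1.3},  # position 3
]


def additive_energy(site, weights=WEIGHTS):
    """Sum the per-position terms; no term depends on a neighbouring position."""
    if len(site) != len(weights):
        raise ValueError("site length must match the number of matrix columns")
    return sum(column[base] for column, base in zip(weights, site))


# Changing the base at position 1 from A to C costs the same amount
# whatever the bases at positions 2 and 3 are.
cost_in_context_1 = additive_energy("CCG") - additive_energy("ACG")
cost_in_context_2 = additive_energy("CAT") - additive_energy("AAT")
```

Under such a model the two costs are equal by construction, which is exactly the behaviour a context-dependent model is meant to relax.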


We present a context-dependent model for DNA-zinc-finger protein interactions that allows us to identify inter-positional dependencies in the DNA recognition code for C2H2 zinc-finger proteins. The degree of non-independence was detected by comparing the predictions of a linear perceptron model with those of a non-linear neural net (NN) model for DNA-zinc-finger protein interactions. This dependency is supported by the complex base-amino acid contacts observed in structural analyses of DNA-zinc-finger interactions. Using extensive published qualitative and quantitative experimental data, we demonstrate that the context-dependent model developed in this study significantly improves predictions of DNA binding profiles and free energies of binding, for both individual zinc fingers and proteins with multiple zinc fingers, compared with previous position-independent models. This approach can be extended to other protein families with complex base-amino acid residue interactions, which would help to further understand transcriptional regulation in eukaryotic genomes.
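The idea of measuring non-independence as the gap between an additive model and the data can be illustrated with a much simpler stand-in than the perceptron and NN comparison described above. The sketch below fits the least-squares additive model to a complete two-way table of energies and reports what is left over. The tables and function names are hypothetical, and this is not the authors' method.

```python
def additive_fit_residuals(table):
    """Residuals of the least-squares additive fit to a complete two-way table.

    table[i][j] is a hypothetical binding energy for option i at one position
    combined with option j at another. For a complete table the best additive
    fit is row_mean[i] + col_mean[j] - grand_mean.
    """
    n_rows, n_cols = len(table), len(table[0])
    row_mean = [sum(row) / n_cols for row in table]
    col_mean = [sum(table[i][j] for i in range(n_rows)) / n_rows
                for j in range(n_cols)]
    grand_mean = sum(row_mean) / n_rows
    return [[table[i][j] - (row_mean[i] + col_mean[j] - grand_mean)
             for j in range(n_cols)]
            for i in range(n_rows)]


def max_abs_residual(table):
    """Largest deviation from additivity; zero means the positions are independent."""
    return max(abs(r) for row in additive_fit_residuals(table) for r in row)


# Made-up energies: the first table is exactly a row term plus a column term,
# the second adds an extra 2.0 only for the (1, 1) combination.
independent = [[0.0, 1.0], [2.0, 3.0]]
dependent = [[0.0, 1.0], [2.0, 5.0]]

print(max_abs_residual(independent))
print(max_abs_residual(dependent))
```

A residual near zero means an additive (weight-matrix-like) description is sufficient for that pair of positions, while a clearly non-zero residual points to the kind of inter-positional dependency that a non-linear model can capture.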


The software is implemented as C programs and is available on request.

[Indexed for MEDLINE]