J Mol Biol. 2004 Apr 23;338(2):207-15.

Constrained binding site diversity within families of transcription factors enhances pattern discovery bioinformatics.

Author information

  • Center for Genomics and Bioinformatics, Karolinska Institutet, Stockholm, Sweden.

Abstract

Diverse computational and experimental efforts are required to elucidate the control circuitry regulating the transcription of human genes. The fusion of gene-specific promoter analyses with large microarray studies and bioinformatics advances has produced optimism that significant progress can be made in unravelling this complex network. Within bioinformatics, past emphasis for improved pattern discovery has been placed upon "phylogenetic footprinting", the identification of sequences conserved over moderate periods of evolution (e.g. human and mouse comparisons). We introduce a new direction in bioinformatics based on the constraints imposed by the structures of DNA-binding proteins. For most structurally related families of transcription factors, there are clear similarities in the sequences of the sites to which they bind. On the basis of this observation, we construct familial binding profiles for well-characterized transcription factor families. The profiles are shown to classify correctly the structural class of mediating transcription factors for novel motifs in 88% of cases. By incorporating the familial profiles into pattern discovery procedures, we demonstrate that functional binding sites can be found in genomic sequences of dramatically greater length than is possible otherwise. Thus, incorporating familial models can overcome the signal-to-noise challenge that has hindered the transition from microarray data to regulatory control sequences for human genes. Biochemically motivated constraints upon sequence diversity of binding sites will complement the genetically motivated constraints imposed in "phylogenetic footprinting" algorithms.
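The idea of a familial binding profile can be illustrated with a small sketch. The abstract does not say how the profiles are built or how novel motifs are compared against them, so everything here is an assumption made for illustration: motifs are pre-aligned, equal-width frequency matrices; a familial profile is the column-wise mean of its members; similarity is the mean per-column Pearson correlation; and the family names and matrices are invented toy data, not values from the paper.

```python
from math import sqrt

# A motif is a list of columns; each column holds frequencies for A, C, G, T.
# All matrices below are hypothetical toy data, not taken from the paper.

def average_profile(motifs):
    """Column-wise mean of equal-width, pre-aligned motifs (an assumed construction)."""
    width = len(motifs[0])
    return [
        [sum(m[i][b] for m in motifs) / len(motifs) for b in range(4)]
        for i in range(width)
    ]

def pearson(x, y):
    """Pearson correlation of two equal-length vectors; 0.0 if either is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)

def similarity(motif, profile):
    """Mean per-column correlation between a motif and a profile of equal width."""
    return sum(pearson(c, p) for c, p in zip(motif, profile)) / len(profile)

def classify(motif, family_profiles):
    """Return the name of the family whose profile is most similar to the motif."""
    return max(family_profiles, key=lambda name: similarity(motif, family_profiles[name]))

# Two hypothetical families, each with two member motifs of width 3.
family_x = [
    [[0.8, 0.1, 0.05, 0.05], [0.1, 0.7, 0.1, 0.1], [0.05, 0.05, 0.8, 0.1]],
    [[0.7, 0.1, 0.1, 0.1], [0.1, 0.8, 0.05, 0.05], [0.1, 0.1, 0.7, 0.1]],
]
family_y = [
    [[0.1, 0.1, 0.1, 0.7], [0.1, 0.1, 0.7, 0.1], [0.7, 0.1, 0.1, 0.1]],
    [[0.05, 0.05, 0.1, 0.8], [0.1, 0.05, 0.8, 0.05], [0.8, 0.1, 0.05, 0.05]],
]
profiles = {
    "family_X": average_profile(family_x),
    "family_Y": average_profile(family_y),
}

# A novel motif that resembles the family_X members (A, C, G preferred in turn).
novel = [[0.6, 0.2, 0.1, 0.1], [0.2, 0.6, 0.1, 0.1], [0.1, 0.1, 0.6, 0.2]]
print(classify(novel, profiles))
```

A real analysis would also need to align motifs of different widths and consider both strands, which this sketch deliberately leaves out.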

PMID: 15066426 [PubMed - indexed for MEDLINE]