    Nucleic Acids Res. 2005 Apr 22;33(7):2290-301. Print 2005.

    Computational technique for improvement of the position-weight matrices for the DNA/protein binding sites.

    Gershenzon NI, Stormo GD, Ioshikhes IP.

    Department of Biomedical Informatics, The Ohio State University, 3184 Graves Hall, 333 W. 10th Avenue, Columbus, OH 43210, USA. gershenzon-1@medctr.osu.edu

    Position-weight matrices (PWMs) are broadly used to locate transcription factor binding sites in DNA sequences. The majority of existing PWMs provide a low level of both sensitivity and specificity. We present a new computational algorithm, a modification of the Staden-Bucher approach, that improves the PWM. We applied the proposed technique to the PWM of the GC-box, the binding site for Sp1. The comparison of old and new PWMs shows that the latter increases both sensitivity and specificity. The statistical parameters of GC-box distribution in promoter regions and in the human genome, as well as in each chromosome, are presented. The majority of commonly used PWMs are 4-row mononucleotide matrices, although 16-row dinucleotide matrices are known to be more informative. The algorithm efficiently determines the 16-row matrices, and preliminary results show that such matrices provide better results than 4-row matrices.
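As a rough illustration of the difference between a 4-row mononucleotide matrix and a 16-row dinucleotide matrix, the sketch below builds a generic log-odds PWM from aligned sites and scans a sequence with it. It is not the algorithm described in the abstract: the function names (`build_pwm`, `score_window`, `scan`), the uniform background, the pseudocount, the threshold, and the toy GC-rich sites are all assumptions made for the example, and the sites are invented rather than taken from real Sp1 data.

```python
import math
from collections import Counter
from itertools import product

BASES = "ACGT"

def build_pwm(sites, k=1, pseudocount=1.0):
    """Build a log-odds matrix from equal-length aligned sites.

    k=1 yields 4 rows (mononucleotides); k=2 yields 16 rows (dinucleotides).
    Assumes a uniform background, which is a simplification.
    """
    alphabet = ["".join(p) for p in product(BASES, repeat=k)]
    background = 1.0 / len(alphabet)
    width = len(sites[0]) - k + 1  # number of matrix columns
    total = len(sites) + pseudocount * len(alphabet)
    pwm = []
    for pos in range(width):
        counts = Counter(site[pos:pos + k] for site in sites)
        column = {
            sym: math.log2((counts[sym] + pseudocount) / total / background)
            for sym in alphabet
        }
        pwm.append(column)
    return pwm

def score_window(pwm, window, k=1):
    """Sum the per-column log-odds weights for one candidate site."""
    return sum(col[window[i:i + k]] for i, col in enumerate(pwm))

def scan(pwm, sequence, site_len, k=1, threshold=0.0):
    """Return (start, score) for every window scoring at or above threshold.

    Assumes the sequence contains only A, C, G and T.
    """
    hits = []
    for start in range(len(sequence) - site_len + 1):
        score = score_window(pwm, sequence[start:start + site_len], k)
        if score >= threshold:
            hits.append((start, score))
    return hits

# Hypothetical aligned sites, invented for illustration only.
toy_sites = ["GGGCGG", "GGGCGG", "GGGAGG", "GGGCGT"]
mono = build_pwm(toy_sites, k=1)  # 4 rows x 6 columns
di = build_pwm(toy_sites, k=2)    # 16 rows x 5 columns
hits = scan(mono, "ATATGGGCGGATAT", site_len=6, k=1, threshold=0.0)
best_start, best_score = max(hits, key=lambda h: h[1])
```

The dinucleotide matrix has one column fewer than the site length because each column scores an overlapping pair of adjacent bases, which is how it can capture dependencies between neighboring positions that a mononucleotide matrix ignores. In practice the score threshold governs the trade-off between sensitivity and specificity; the value used here is arbitrary.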

    PMID: 15849315 [PubMed - indexed for MEDLINE]

    PMCID: 1084321
