Display Settings:

Format

Send to:

Choose Destination
See comment in PubMed Commons below
Bioinformatics. 2007 Jul 1;23(13):i72-9.

RankMotif++: a motif-search algorithm that accounts for relative ranks of K-mers in binding transcription factors.

Author information

  • 1Banting and Best Department of Medical Research, University of Toronto, Toronto, ON, Canada.

Abstract

MOTIVATION:

The sequence specificity of DNA-binding proteins is typically represented as a position weight matrix in which each base position contributes independently to relative affinity. Assessment of the accuracy and broad applicability of this representation has been limited by the lack of extensive DNA-binding data. However, new microarray techniques, in which preferences for all possible K-mers are measured, enable a broad comparison of both motif representation and methods for motif discovery. Here, we consider the problem of accounting for all of the binding data in such experiments, rather than the highest affinity binding data. We introduce the RankMotif++, an algorithm designed for finding motifs whenever sequences are associated with a semi-quantitative measure of protein-DNA-binding affinity. RankMotif++ learns motif models by maximizing the likelihood of a set of binding preferences under a probabilistic model of how sequence binding affinity translates into binding preference observations. Because RankMotif++ makes few assumptions about the relationship between binding affinity and the semi-quantitative readout, it is applicable to a wide variety of experimental assays of DNA-binding preference.

RESULTS:

By several criteria, RankMotif++ predicts binding affinity better than two widely used motif finding algorithms (MDScan, MatrixREDUCE) or more recently developed algorithms (PREGO, Seed and Wobble), and its performance is comparable to a motif model that separately assigns affinities to 8-mers. Our results validate the PWM model and provide an approximation of the precision and recall that can be expected in a genomic scan.

AVAILABILITY:

RankMotif++ is available upon request.

SUPPLEMENTARY INFORMATION:

Supplementary data are available at Bioinformatics online.

PMID:
17646348
[PubMed - indexed for MEDLINE]
Free full text
PubMed Commons home

PubMed Commons

0 comments
How to join PubMed Commons

    Supplemental Content

    Full text links

    Icon for HighWire
    Loading ...
    Write to the Help Desk