BMC Bioinformatics. 2007 Mar 8;8:84.

Statistical tests to compare motif count exceptionalities.

Author information

INA PG/ENGREF/INRA, UMR518 Unité Mathématiques et Informatique Appliquées, 75005 Paris, France. robin@inapg.inra.fr

Abstract

BACKGROUND:

Finding over- or under-represented motifs in biological sequences is now a common task in genomics. P-values computed for motif counts identify exceptional motifs, which are candidates for functional motifs. The present work addresses the related question of comparing the exceptionality of one motif in two different sequences. Simply comparing the motif count p-values obtained in each sequence is not sufficient to decide whether the motif is significantly more exceptional in one sequence than in the other. A statistical test is required.

RESULTS:

We develop and analyze two statistical tests, an exact binomial test and an asymptotic likelihood ratio test, to decide whether the exceptionality of a given motif is equivalent or significantly different in two sequences of interest. For that purpose, motif occurrences are modeled by Poisson processes, with special care for overlapping motifs. Both tests can take the sequence compositions into account. As an illustration, we compare the octamer exceptionalities in the Escherichia coli K-12 backbone versus variable strain-specific loops.
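
As a rough illustration of how two such tests can be set up, the sketch below uses the standard conditional argument for comparing two Poisson counts: if the counts N1 and N2 are independent Poisson variables, then given their total n, N1 is binomial with parameters n and p. Under the null hypothesis of equal exceptionality, p is taken here to be e1 / (e1 + e2), where e1 and e2 are the expected counts of the motif in each sequence. This is an assumption-laden sketch and not the authors' implementation: the function names, the use of externally supplied expected counts to represent sequence composition, and the doubled-tail two-sided p-value are all choices made for this example, and the special treatment of overlapping motifs mentioned above is not handled.

```python
import math


def binom_pmf(k, n, p):
    """Binomial probability P(X = k), computed on the log scale for stability."""
    if p <= 0.0:
        return 1.0 if k == 0 else 0.0
    if p >= 1.0:
        return 1.0 if k == n else 0.0
    log_pmf = (
        math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
        + k * math.log(p) + (n - k) * math.log(1.0 - p)
    )
    return math.exp(log_pmf)


def exact_binomial_test(n1, n2, e1, e2):
    """Two-sided exact p-value for H0: both counts share the same ratio
    observed/expected. n1, n2 are observed counts; e1, e2 are expected
    counts (hypothetical inputs, e.g. from a composition model)."""
    n = n1 + n2
    p0 = e1 / (e1 + e2)  # null probability that an occurrence falls in sequence 1
    upper = sum(binom_pmf(k, n, p0) for k in range(n1, n + 1))
    lower = sum(binom_pmf(k, n, p0) for k in range(0, n1 + 1))
    # Doubling the smaller tail is one common two-sided convention (an assumption here).
    return min(1.0, 2.0 * min(upper, lower))


def likelihood_ratio_test(n1, n2, e1, e2):
    """Asymptotic likelihood ratio test of the same null hypothesis.
    Returns (statistic, p-value); the statistic is compared with a
    chi-square distribution with 1 degree of freedom."""
    n = n1 + n2
    p0 = e1 / (e1 + e2)

    def term(observed, expected):
        # Convention: 0 * log(0) = 0.
        return observed * math.log(observed / expected) if observed > 0 else 0.0

    stat = 2.0 * (term(n1, n * p0) + term(n2, n * (1.0 - p0)))
    stat = max(stat, 0.0)  # guard against tiny negative rounding errors
    # Survival function of chi-square(1): P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


if __name__ == "__main__":
    # Made-up counts for illustration only: 30 vs 10 occurrences,
    # with equal expected counts in the two sequences.
    print(exact_binomial_test(30, 10, 12.0, 12.0))
    print(likelihood_ratio_test(30, 10, 12.0, 12.0))
```

The exact version sums binomial probabilities directly, so it stays valid for small counts, while the likelihood ratio version needs only a closed-form statistic and becomes a good approximation as the total count grows.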

CONCLUSION:

The exact binomial test is particularly well suited to small counts. For large counts, we advise using the likelihood ratio test, which is asymptotic but strongly correlated with the exact binomial test and very simple to use.

PMID:
17346349
PMCID:
PMC1838430
DOI:
10.1186/1471-2105-8-84
[Indexed for MEDLINE]
Free PMC Article