Genome Inform. 2008;21:15-26.

Factoring local sequence composition in motif significance analysis.

Author information

1. Department of Computer Science, Cornell University, Ithaca, NY 14853, USA.

Abstract

We recently introduced a biologically realistic and reliable significance analysis of the output of a popular class of motif finders. In this paper, we further improve that analysis by incorporating local base composition information. Relying on realistic biological data simulation, as well as on false discovery rate (FDR) analysis applied to real data, we show that our method is significantly better than the increasingly popular practice of using the normal approximation to estimate the significance of a finder's output. Finally, we turn to leveraging our reliable significance analysis to improve the actual motif finding task. Specifically, by endowing a variant of the Gibbs Sampler with our improved significance analysis, we demonstrate that de novo finders can perform better than has been perceived. Notably, our new variant outperforms all the finders reviewed in a recently published comprehensive analysis of the Harbison genome-wide binding location data. Interestingly, many of these finders incorporate additional information such as nucleosome positioning and the significance of binding data.

PMID: 19425144 [Indexed for MEDLINE]
