J Comput Biol. 2002;9(2):211-23.

Algorithms for phylogenetic footprinting.

Author information

Department of Computer Science and Engineering, Box 352350, University of Washington, Seattle, WA 98195-2350, USA. blanchem@cs.washington.edu

Abstract

Phylogenetic footprinting is a technique that identifies regulatory elements by finding unusually well conserved regions in a set of orthologous noncoding DNA sequences from multiple species. We introduce a new motif-finding problem, the Substring Parsimony Problem, which is a formalization of the ideas behind phylogenetic footprinting, and we present an exact dynamic programming algorithm to solve it. We then present a number of algorithmic optimizations that allow our program to run quickly on most biologically interesting datasets. We show how to handle datasets in which only an unknown subset of the sequences contains the regulatory element. Finally, we describe how to empirically assess the statistical significance of the motifs found. Each technique is implemented and successfully identifies a number of known binding sites, as well as several highly conserved but uncharacterized regions. The program is available at http://bio.cs.washington.edu/software.html.
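
The abstract does not spell out the dynamic program, so the following is only a minimal sketch of one way a substring-parsimony computation could look, not the authors' implementation. It assumes Hamming distance as the edge cost, a rooted tree given as nested tuples of leaf names, and a brute-force table over all 4^k candidate k-mers; the function names (`hamming`, `kmers_of`, `best_parsimony`) and the input layout are hypothetical, and none of the optimizations mentioned above are included.

```python
from itertools import product

INF = float("inf")

def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))

def kmers_of(seq, k):
    """Set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def best_parsimony(tree, leaf_seqs, k, alphabet="ACGT"):
    """Return (score, root_kmer) minimizing total Hamming cost over tree edges.

    tree      -- nested tuples whose leaves are names (assumed layout)
    leaf_seqs -- dict mapping each leaf name to its DNA sequence
    k         -- motif length
    """
    all_kmers = ["".join(p) for p in product(alphabet, repeat=k)]

    def table(node):
        # Leaf: cost 0 for k-mers that occur in the sequence, infinite otherwise.
        if isinstance(node, str):
            present = kmers_of(leaf_seqs[node], k)
            return {s: (0 if s in present else INF) for s in all_kmers}
        # Internal node: for each label s, add the cheapest choice per child.
        child_tables = [table(child) for child in node]
        return {
            s: sum(min(ct[t] + hamming(s, t) for t in all_kmers)
                   for ct in child_tables)
            for s in all_kmers
        }

    root = table(tree)
    best = min(root, key=root.get)
    return root[best], best

# Toy input (made up for illustration): "ACG" occurs in all three sequences.
seqs = {"a": "TTACGTT", "b": "GGACGGG", "c": "CCACGCC"}
score, motif = best_parsimony((("a", "b"), "c"), seqs, k=3)
```

This brute-force table costs on the order of 4^k times 4^k work per tree edge, so it is only practical for small k; it is meant to illustrate the recurrence rather than to scale to real datasets.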

PMID: 12015878
DOI: 10.1089/10665270252935421
[Indexed for MEDLINE]
