    J Comput Biol. 2000;7(3-4):345-62.

    Algorithms for extracting structured motifs using a suffix tree with an application to promoter and regulatory site consensus identification.

    Source

    Institut Gaspard Monge, Université de Marne la Vallée 5.

    Abstract

    This paper introduces two exact algorithms for extracting conserved structured motifs from a set of DNA sequences. A structured motif may be described as an ordered collection of p ≥ 1 "boxes" (each box corresponding to one part of the structured motif), p substitution rates (one for each box), and p - 1 intervals of distance (one for each pair of successive boxes in the collection). The contents of the boxes (that is, the motifs themselves) are unknown at the start of the algorithm; finding them is precisely what the algorithms are meant to do. A suffix tree is used to find such motifs. The algorithms are efficient enough to infer site consensus sequences, such as promoter sequences or regulatory sites, from a set of unaligned sequences corresponding to the noncoding regions upstream of all genes of a genome. In particular, the time complexity of both algorithms scales linearly with N²n, where n is the average length of the sequences and N is their number. An application to the identification of promoter and regulatory consensus sequences in bacterial genomes is shown.
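
    To make the definition of a structured motif concrete, the following is a minimal Python sketch of the model only. It is not either of the paper's algorithms: it uses a brute-force scan rather than a suffix tree, and it checks a given candidate motif instead of inferring the box contents. The names Box, StructuredMotif, occurs_in and quorum are hypothetical. Two assumptions are made: each box's substitution rate is modelled as a maximum number of mismatches, and the distance between two boxes is measured from the end of one box to the start of the next.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Box:
    pattern: str            # candidate box content (given here; the paper infers it)
    max_substitutions: int  # assumption: substitution rate as a mismatch budget


@dataclass
class StructuredMotif:
    boxes: List[Box]             # p >= 1 boxes, in order
    gaps: List[Tuple[int, int]]  # p - 1 inclusive (min, max) distance intervals


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def occurs_at(motif: StructuredMotif, seq: str, start: int, i: int = 0) -> bool:
    """True if boxes i.. of the motif can be placed with box i starting at `start`."""
    box = motif.boxes[i]
    end = start + len(box.pattern)
    if end > len(seq):
        return False
    if hamming(seq[start:end], box.pattern) > box.max_substitutions:
        return False
    if i == len(motif.boxes) - 1:
        return True
    dmin, dmax = motif.gaps[i]
    # Try every allowed distance between the end of this box and the next box.
    return any(occurs_at(motif, seq, end + d, i + 1) for d in range(dmin, dmax + 1))


def occurs_in(motif: StructuredMotif, seq: str) -> bool:
    """True if the motif occurs anywhere in the sequence."""
    return any(occurs_at(motif, seq, s) for s in range(len(seq)))


def quorum(motif: StructuredMotif, seqs: List[str]) -> int:
    """Number of sequences that contain at least one occurrence of the motif."""
    return sum(occurs_in(motif, s) for s in seqs)


# Illustrative two-box motif (p = 2) with made-up test sequences.
motif = StructuredMotif(
    boxes=[Box("TTGACA", 1), Box("TATAAT", 1)],
    gaps=[(16, 18)],
)
sequences = [
    "GG" + "TTGACA" + "A" * 17 + "TATAAT" + "CC",  # exact boxes, distance 17
    "TTGTCA" + "C" * 16 + "TATACT",                # one substitution per box
    "TTGACA" + "G" * 10 + "TATAAT",                # distance 10 is outside 16..18
]
print(quorum(motif, sequences))  # prints 2
```

    The scan above costs time proportional to the sequence length times the total width of the gap intervals for each candidate motif, which is why it is only suitable for illustrating the model on small inputs.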

    PMID: 11108467
    [PubMed - indexed for MEDLINE]
