    J Mol Biol. 1991 Oct 20;221(4):1367-78.

    An efficient algorithm for identifying matches with errors in multiple long molecular sequences.

    Source

    Division of Mathematics, Computer Science and Statistics, University of Texas, San Antonio 78249-0664.

    Abstract

    An efficient algorithm is described for finding matches, repeats and other word relations, allowing for errors, in large data sets of long molecular sequences. The algorithm entails hashing on fixed-size words in conjunction with the use of a linked list connecting all occurrences of the same word. The average memory and run-time requirements both increase almost linearly with the total sequence length. Some results of the program's performance on a database of Escherichia coli DNA sequences are presented.
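
The data structure named in the abstract (a hash on fixed-size words plus a linked list joining all occurrences of the same word) can be pictured with the minimal sketch below. It is an illustration under stated assumptions, not the authors' program: the names `build_word_index` and `occurrences`, the word length `k`, and the use of a Python `dict` as the hash table are all hypothetical choices, and the error-tolerant matching step described in the paper is not shown.

```python
from typing import Dict, List, Tuple


def build_word_index(
    seqs: List[str], k: int
) -> Tuple[Dict[str, int], List[int], List[Tuple[int, int]]]:
    """Hash every k-letter word and chain together occurrences of the same word.

    Assumption: a plain dict stands in for the hash table; this is an
    illustrative sketch, not the published implementation.
    """
    head: Dict[str, int] = {}           # word -> index of its most recent occurrence
    prev: List[int] = []                # prev[i] -> earlier occurrence of the same word, or -1
    where: List[Tuple[int, int]] = []   # where[i] -> (sequence number, offset in that sequence)
    for seq_id, seq in enumerate(seqs):
        for offset in range(len(seq) - k + 1):
            word = seq[offset:offset + k]
            prev.append(head.get(word, -1))  # link back to the previous occurrence
            head[word] = len(where)          # this occurrence becomes the list head
            where.append((seq_id, offset))
    return head, prev, where


def occurrences(
    word: str,
    head: Dict[str, int],
    prev: List[int],
    where: List[Tuple[int, int]],
) -> List[Tuple[int, int]]:
    """Walk the linked list for one word and return its positions in input order."""
    found: List[Tuple[int, int]] = []
    i = head.get(word, -1)
    while i != -1:
        found.append(where[i])
        i = prev[i]
    found.reverse()  # the chain runs from newest to oldest
    return found


if __name__ == "__main__":
    index = build_word_index(["ACGTACGT", "TTACGA"], k=3)
    print(occurrences("ACG", *index))
```

Each sequence position adds exactly one entry to `prev` and `where`, so the storage in this sketch grows linearly with total sequence length, in line with the near-linear behaviour the abstract reports.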

    PMID: 1942056 [PubMed - indexed for MEDLINE]
