Sequencing-by-hybridization at the information-theory bound: an optimal algorithm

J Comput Biol. 2000;7(3-4):621-30. doi: 10.1089/106652700750050970.

Abstract

In a recent paper (Preparata et al., 1999) we introduced a novel probing scheme for DNA sequencing by hybridization (SBH). The new gapped-probe scheme combines natural and universal bases in a well-defined periodic pattern. It has been shown (Preparata et al., 1999) that the performance of the gapped-probe scheme (measured as the length of a sequence that can be uniquely reconstructed using a probe library of a given size) is significantly better than that of the standard scheme based on oligomer probes. In this paper we present and analyze a new, more powerful sequencing algorithm for the gapped-probe scheme. We prove that the new algorithm exploits the full potential of the SBH technology, with high-confidence performance that comes within a small constant factor (about 2) of the information-theory bound. Moreover, this performance is achieved while maintaining running time linear in the target sequence length.
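
As an illustration of the two notions used above, the following minimal Python sketch computes a simple counting form of the information-theory bound and extracts the spectrum of a target under a gapped probe pattern. It is not the algorithm of the paper. The function names (`counting_bound`, `gapped_pattern`, `spectrum`), the parameters `s` and `r`, the `*` symbol for a universal base, and the specific layout of natural and universal positions are assumptions made for illustration; the abstract states only that the pattern is periodic.

```python
def counting_bound(k):
    """Counting upper bound on the reconstructible length m.

    A library of probes with k natural bases has 4**k members, so a
    yes/no spectrum carries at most 4**k bits. There are 4**m = 2**(2*m)
    sequences of length m, hence 2*m <= 4**k.
    """
    return 4**k // 2


def gapped_pattern(s, r):
    """Hypothetical periodic layout ('1' = natural base, '0' = universal base).

    Assumed form: s natural bases, then r repeats of
    (s - 1 universal bases followed by 1 natural base).
    """
    return "1" * s + ("0" * (s - 1) + "1") * r


def spectrum(target, pattern):
    """Set of probes that hybridize to target, idealized (no errors).

    Universal positions match any base and are written as '*'.
    """
    width = len(pattern)
    probes = set()
    for i in range(len(target) - width + 1):
        window = target[i:i + width]
        probes.add(
            "".join(b if p == "1" else "*" for b, p in zip(window, pattern))
        )
    return probes
```

For example, under these assumptions `gapped_pattern(2, 1)` gives `"1101"`, `spectrum("ACGTACGTAC", "1101")` gives the four probes `AC*T`, `CG*A`, `GT*C`, and `TA*G`, and `counting_bound(8)` gives 32768.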

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Algorithms*
  • Computational Biology
  • Models, Statistical
  • Nucleic Acid Hybridization
  • Sequence Analysis, DNA / statistics & numerical data*