BMC Bioinformatics. 2009 Jun 9;10:175. doi: 10.1186/1471-2105-10-175.

Local alignment of two-base encoded DNA sequence.

Author information

Department of Computer Science, University of California Los Angeles, Los Angeles, California 90095, USA. nhomer@cs.ucla.edu

Abstract

BACKGROUND:

DNA sequence comparison is based on optimal local alignment of two sequences using a similarity score. However, some new DNA sequencing technologies do not directly measure the base sequence, but rather an encoded form, such as the two-base encoding considered here. In order to compare such data to a reference sequence, the data must be decoded into sequence. The decoding is deterministic, but the possibility of measurement errors requires searching among all possible error modes and resulting alignments to achieve an optimal balance of fewer errors versus greater sequence similarity.
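
To make the deterministic decoding concrete, here is a minimal sketch of one common two-base ("color space") scheme. The abstract does not spell out the code table, so the mapping below (A=0, C=1, G=2, T=3, with each color being the XOR of adjacent base codes) and the function names `encode` and `decode` are assumptions chosen for illustration. The sketch also shows why measurement errors matter: a single wrong color changes every base decoded after it.

```python
# Assumed 2-bit base codes; the color between two adjacent bases is their XOR.
BASE_TO_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
CODE_TO_BASE = "ACGT"


def encode(seq):
    """Return the list of colors (0-3) for each adjacent pair of bases."""
    codes = [BASE_TO_CODE[b] for b in seq]
    return [x ^ y for x, y in zip(codes, codes[1:])]


def decode(first_base, colors):
    """Rebuild the base sequence from a known first base and the colors."""
    code = BASE_TO_CODE[first_base]
    bases = [first_base]
    for color in colors:
        code ^= color  # next base is fully determined by previous base and color
        bases.append(CODE_TO_BASE[code])
    return "".join(bases)


colors = encode("ACGTT")
decoded = decode("A", colors)

# Simulate one measurement error in the second color.
bad_colors = list(colors)
bad_colors[1] = 0
corrupted = decode("A", bad_colors)
```

Under these assumptions, `decoded` reproduces the original sequence, while `corrupted` differs from it at every position downstream of the erroneous color, which is why naive decoding followed by ordinary alignment is fragile.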

RESULTS:

We present an extension of the standard dynamic programming method for local alignment that simultaneously decodes the data and performs the alignment, maximizing a similarity score based on a weighted combination of errors and edits, and allowing an affine gap penalty. We also present simulations that demonstrate the performance characteristics of our two-base encoded alignment method and contrast those with standard DNA sequence alignment under the same conditions.
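
For orientation, the standard method being extended can be sketched as a local alignment with affine gap penalties. This is only the conventional baseline on already-decoded bases, not the authors' two-base extension. The function name `local_align_affine` and the scoring values are illustrative assumptions, and a gap of length k is assumed to cost `gap_open + (k - 1) * gap_extend`.

```python
def local_align_affine(a, b, match=2, mismatch=-1, gap_open=-3, gap_extend=-1):
    """Return the best local alignment score between a and b (affine gaps)."""
    neg_inf = float("-inf")
    n, m = len(a), len(b)
    # H: best score of an alignment ending at (i, j), floored at 0 (local).
    # E: best score ending with a gap that consumes b[j-1] only.
    # F: best score ending with a gap that consumes a[i-1] only.
    H = [[0] * (m + 1) for _ in range(n + 1)]
    E = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    F = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    best = 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Either open a new gap from H or extend an existing one.
            E[i][j] = max(H[i][j - 1] + gap_open, E[i][j - 1] + gap_extend)
            F[i][j] = max(H[i - 1][j] + gap_open, F[i - 1][j] + gap_extend)
            sub = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0, H[i - 1][j - 1] + sub, E[i][j], F[i][j])
            best = max(best, H[i][j])
    return best


score_identical = local_align_affine("ACGT", "ACGT")
score_one_gap = local_align_affine("ACGTACGT", "ACGTTACGT")
```

A joint decode-and-align method of the kind the abstract describes would need additional state in each cell to track the decoded base and to charge for measurement errors; that part is not shown here.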

CONCLUSION:

The new local alignment algorithm for two-base encoded data has substantial power to properly detect and correct measurement errors while identifying underlying sequence variants, thereby facilitating genome re-sequencing efforts based on this form of sequence data.

PMID: 19508732
PMCID: PMC2709925
DOI: 10.1186/1471-2105-10-175
[Indexed for MEDLINE]