Bioinformatics. 2012 Apr 15;28(8):1070-7. doi: 10.1093/bioinformatics/bts102. Epub 2012 Mar 7.

YOABS: yet other aligner of biological sequences--an efficient linearly scaling nucleotide aligner.

Author information

Qualg Inc., La Jolla, CA 92037, USA. vit@ucsd.edu

Abstract

MOTIVATION:

The explosive growth of short-read sequencing technologies in recent years has led to the rapid development of many new alignment algorithms and programs. Most of them, however, are inefficient or not applicable for reads of roughly 200 bp or longer, because they were designed specifically to process short queries with relatively low sequencing error rates. At the same time, the current drive to detect structural variations in assembled genomes more reliably, and to facilitate de novo sequencing, demands complementing high-throughput short-read platforms with long-read mapping. Algorithms and programs for efficient mapping of longer reads are therefore becoming crucial. Yet the choice of long-read aligners that are effective in terms of both performance and memory is limited to a handful of algorithms based on hash tables (BLAT, SSAHA2) or tries (Burrows-Wheeler Transform - Smith-Waterman (BWT-SW), Burrows-Wheeler Aligner - Smith-Waterman (BWA-SW)).

RESULTS:

A new O(n) algorithm that combines the advantages of both hash-based and trie-based methods has been designed to align long biological sequences (roughly 200 bp or longer) effectively against a large sequence database with a small memory footprint (e.g. ~2 GB for the human genome). The algorithm is accurate and significantly faster than BLAT or BWT-SW, and, like BWT-SW, it can find all local alignments. It is as accurate as SSAHA2 or BWA-SW, but uses more than 3 times less memory and is more than 10 times faster than SSAHA2; at low error rates it is several times faster than BWA-SW while using almost half the memory.
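
For readers unfamiliar with the hash-based family mentioned above, the following is a minimal, generic sketch of k-mer seeding with a hash table. It is not the YOABS algorithm, whose details the abstract does not give. The function names `build_kmer_index` and `best_diagonal`, the toy sequences, and the choice of k = 4 are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def build_kmer_index(reference, k):
    """Map every k-mer in the reference to the list of positions where it starts."""
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return index


def best_diagonal(index, query, k):
    """Return (diagonal, votes) for the reference offset supported by most seeds.

    A diagonal is reference_position - query_position; exact k-mer hits that
    belong to the same gap-free alignment share one diagonal.
    Returns None when no k-mer of the query occurs in the reference.
    """
    votes = Counter()
    for qpos in range(len(query) - k + 1):
        # .get avoids inserting empty entries into the defaultdict
        for rpos in index.get(query[qpos:qpos + k], ()):
            votes[rpos - qpos] += 1
    if not votes:
        return None
    return votes.most_common(1)[0]


if __name__ == "__main__":
    # Toy data, purely illustrative
    reference = "ACGTTGCAAGGCTTACCGATAGGCTA"
    query = reference[8:20]
    index = build_kmer_index(reference, k=4)
    print(best_diagonal(index, query, k=4))
```

In a real aligner the candidate region found this way would then be refined with a dynamic-programming step such as Smith-Waterman. The sketch covers only the seeding stage and makes no claim about memory use or speed.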

AVAILABILITY AND IMPLEMENTATION:

The prototype implementation of the algorithm will be available upon request for non-commercial use in academia (local hit table binary and indices are at ftp://styx.ucsd.edu).

PMID: 22402614
DOI: 10.1093/bioinformatics/bts102
[Indexed for MEDLINE]
