Brief Bioinform. 2014 Mar;15(2):138-54. doi: 10.1093/bib/bbt081. Epub 2014 Jan 10.

A bioinformatician's guide to the forefront of suffix array construction algorithms.

Author information

1. Computational Biology Research Center, AIST, Tokyo, Japan. computome@gmail.com

Abstract

The suffix array and its variants are text-indexing data structures that have become indispensable in the field of bioinformatics. With the uninitiated in mind, we provide an accessible exposition of the SA-IS algorithm, which is the state of the art in suffix array construction. We also describe DisLex, a technique that allows standard suffix array construction algorithms to create modified suffix arrays designed to enable a simple form of inexact matching needed to support 'spaced seeds' and 'subset seeds' used in many biological applications.
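To make the two ideas in the abstract concrete, here is a minimal Python sketch. It is not SA-IS and not DisLex: both functions use a naive comparison sort, which is far slower than the linear-time construction the paper describes. The function names, the `mask` parameter, and the convention that `1` means "compare this position" and `0` means "ignore it" (with the mask repeated cyclically along each suffix) are assumptions made for illustration only.

```python
def suffix_array(text):
    """Return the start positions of all suffixes of text, in sorted suffix order.

    Naive illustration only: sorting whole suffixes is quadratic or worse,
    whereas SA-IS builds the same array in linear time.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def spaced_suffix_array(text, mask):
    """Sort suffixes while ignoring the positions where the mask has '0'.

    The mask (for example "101") is assumed to repeat cyclically along each
    suffix. This only shows the ordering a spaced-seed index needs; it does
    not implement the DisLex text transformation itself.
    """
    def key(i):
        suffix = text[i:]
        return [c for j, c in enumerate(suffix) if mask[j % len(mask)] == "1"]

    # sorted() is stable, so suffixes with equal keys stay in position order.
    return sorted(range(len(text)), key=key)


if __name__ == "__main__":
    print(suffix_array("banana"))            # ordinary lexicographic order
    print(spaced_suffix_array("ACAT", "10"))  # order under a hypothetical seed "10"
```

For the text `ACAT`, the ordinary order places suffix 0 (`ACAT`) before suffix 2 (`AT`), but under the mask `10` the second letter is ignored, so the two are compared on `A`, `A` versus `A` and suffix 2 sorts first.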

KEYWORDS: linear-time algorithm; spaced seeds; subset seeds; suffix array construction; text index

PMID: 24413184
PMCID: PMC3956071
DOI: 10.1093/bib/bbt081
[Indexed for MEDLINE]
Free PMC Article
