National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, USA.
Sequence alignment is one of the most important bioinformatics tools in modern molecular biology. The statistical characterization of gapped alignment scores is a long-standing problem in sequence alignment research. In this paper, we provide a self-contained exposition of sequence alignment, a short review of how this problem relates to the directed polymer problem in statistical physics, and some analytical results that can be used to predict alignment score statistics. Specifically, we present two classes of solutions for gapped alignment statistics, obtained by explicitly calculating the evolution of the few-replica partition function in 1+1 dimensions. We also obtain the conditions under which lambda, the more important of the extremal parameters characterizing the alignment score statistics, becomes predictable.
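For readers unfamiliar with the quantity whose statistics are at issue, the following is a minimal sketch of a gapped local alignment score, computed with the standard Smith-Waterman recursion. It is an illustration only and not the scoring scheme analyzed in the paper: the function name `local_alignment_score`, the match/mismatch values, and the linear gap cost are all assumptions chosen for brevity.

```python
def local_alignment_score(a, b, match=1, mismatch=-1, gap=2):
    """Best gapped local alignment score of strings a and b.

    Smith-Waterman recursion with a linear gap cost (an assumed,
    illustrative scoring scheme). Only the score is returned, not
    the alignment itself.
    """
    prev = [0] * (len(b) + 1)  # scores for the previous row of the lattice
    best = 0
    for ch_a in a:
        curr = [0]  # first column: an empty alignment scores 0
        for j, ch_b in enumerate(b, start=1):
            # Extend the alignment by pairing ch_a with ch_b.
            diag = prev[j - 1] + (match if ch_a == ch_b else mismatch)
            # Extend by a gap in b (ch_a unpaired) or in a (ch_b unpaired).
            up = prev[j] - gap
            left = curr[j - 1] - gap
            # A local alignment may restart anywhere, hence the floor at 0.
            score = max(0, diag, up, left)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best
```

The alignment score is the maximum of this recursion over the whole lattice; the score statistics discussed here concern the distribution of that maximum when the two sequences are random.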