J Comput Biol. 2008 Nov;15(9):1187-94. doi: 10.1089/cmb.2008.0125.

Significance of gapped sequence alignments.

Author information

Center for Bioinformatics, Wadsworth Center, New York State Department of Health, Albany, New York 12201-0509, USA.

Abstract

Measuring the statistical significance of extreme sequence alignment scores is key to many important applications, but it is difficult. To approximate alignment score significance precisely, we draw random samples directly from a well-chosen importance-sampling probability distribution. We apply our technique to pairwise local sequence alignment of nucleic acid and amino acid sequences of length up to 1000. For instance, using a BLOSUM62 scoring system for local sequence alignment, we compute that the p-value of a score of 6000 for the alignment of two sequences of length 1000 is (3.4 ± 0.3) × 10^-1314. Further, we show that the extreme value significance statistic for the local alignment model that we examine does not follow a Gumbel distribution. A web server for this application is available at http://bayesweb.wadsworth.org/alignmentSignificanceV1/.
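
The general idea of importance sampling for a rare-event p-value can be illustrated with a toy model. The sketch below is not the method of the paper: it assumes a hypothetical ungapped comparison of 50 positions, each matching independently with probability 0.25, and estimates the probability of at least 40 matches by sampling from a tilted distribution in which matches occur with probability 0.8. All function names and parameter values are illustrative assumptions. Because the toy model is binomial, the estimate can be checked against the exact tail.

```python
import math
import random


def exact_tail(n, p, k_min):
    """Exact P(K >= k_min) for K ~ Binomial(n, p); used only as a check."""
    return sum(
        math.comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_min, n + 1)
    )


def importance_sampling_tail(n, p, k_min, q, num_samples, rng):
    """Estimate P(K >= k_min) under match probability p by sampling
    match counts under a proposal match probability q (an assumed tilt).

    Returns the estimate and its estimated standard error.
    """
    total = 0.0
    total_sq = 0.0
    for _ in range(num_samples):
        # Draw one match count from the proposal distribution.
        k = sum(rng.random() < q for _ in range(n))
        if k >= k_min:
            # Likelihood ratio of the target to the proposal for this draw.
            w = (p / q) ** k * ((1 - p) / (1 - q)) ** (n - k)
            total += w
            total_sq += w * w
    mean = total / num_samples
    variance = max(total_sq / num_samples - mean**2, 0.0)
    return mean, math.sqrt(variance / num_samples)


rng = random.Random(0)  # fixed seed so the run is reproducible
estimate, std_error = importance_sampling_tail(50, 0.25, 40, 0.8, 20000, rng)
exact = exact_tail(50, 0.25, 40)
print(f"importance sampling: {estimate:.3e} +/- {std_error:.1e}")
print(f"exact binomial tail: {exact:.3e}")
```

Sampling directly from the original distribution would almost never produce 40 or more matches, so a naive estimate would be zero for any practical sample size. Centering the proposal near the threshold makes the rare event common, and the likelihood-ratio weights correct for the change of distribution.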

PMID: 18973434
PMCID: PMC2737730
DOI: 10.1089/cmb.2008.0125
[Indexed for MEDLINE]
Free PMC Article
