BMC Res Notes. 2012 Jun 12;5:286. doi: 10.1186/1756-0500-5-286.

New finite-size correction for local alignment score distributions.

Author information

National Center for Biotechnology Information, National Library of Medicine, Bethesda, MD 20894, USA.

Abstract

BACKGROUND:

Local alignment programs often calculate the probability that a match occurred by chance. The calculation of this probability may require a "finite-size" correction to the lengths of the sequences, as an alignment that starts near the end of either sequence may run out of sequence before achieving a significant score.
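For orientation, the effect of a finite-size correction can be sketched with the classical mean-length form, which shortens each sequence by the expected length of a just-significant alignment before computing the E-value. This is a minimal sketch of that older approach, not the correction introduced in this article. It assumes a pairwise comparison of two sequences, Karlin-Altschul parameters `lam`, `K` and `H`, and a hypothetical `floor` argument standing in for whatever ad hoc minimum length a program substitutes for short sequences. The numeric parameter values in the usage lines are illustrative only.

```python
import math


def effective_lengths(m, n, K, H, floor=1.0):
    """Mean-length finite-size correction (classical form, assumed here).

    m, n  : lengths of the two sequences
    K, H  : Karlin-Altschul K and relative entropy H (nats per aligned pair)
    floor : hypothetical ad hoc minimum effective length
    """
    # Expected length of an alignment whose score is just significant.
    ell = math.log(K * m * n) / H
    # Shorten both sequences; fall back to the floor if a length would
    # become non-positive or very small (the "short sequence" case).
    return max(m - ell, floor), max(n - ell, floor)


def evalue(score, m, n, lam, K, H=None, floor=1.0):
    """E = K * m' * n' * exp(-lam * score).

    With H=None the raw lengths are used (no finite-size correction).
    """
    if H is not None:
        m, n = effective_lengths(m, n, K, H, floor)
    return K * m * n * math.exp(-lam * score)


if __name__ == "__main__":
    # Illustrative parameter values only; real values depend on the
    # scoring system and should be taken from the alignment program.
    params = dict(lam=0.267, K=0.041)
    print(evalue(100, 250, 1_000_000, **params))          # uncorrected
    print(evalue(100, 250, 1_000_000, H=0.14, **params))  # corrected
    print(effective_lengths(50, 1_000_000, K=0.041, H=0.14))
```

In this sketch a query of length 50 falls back to the floor value, which is the kind of ad hoc substitution for short sequences that a distribution-based correction is meant to avoid.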

FINDINGS:

We present an improved finite-size correction that considers the distribution of sequence lengths rather than simply the corresponding means. This approach improves sensitivity and avoids substituting an ad hoc length for short sequences, a substitution that can underestimate the significance of a match. We use a test set derived from ASTRAL to show improved ROC scores, especially for shorter sequences.

CONCLUSIONS:

The new finite-size correction improves the calculation of probabilities for a local alignment. It is now used in the BLAST+ package and at the NCBI BLAST web site (http://blast.ncbi.nlm.nih.gov).

PMID: 22691307
PMCID: PMC3483159
DOI: 10.1186/1756-0500-5-286
[Indexed for MEDLINE]
Free PMC Article
