Comput Biol Chem. 2005 Dec;29(6):440-3. Epub 2005 Nov 10.

SVM-BALSA: remote homology detection based on Bayesian sequence alignment.

Author information

Computational Biology and Bioinformatics, Pacific Northwest National Laboratory, Richland, WA 99352, USA.


Biopolymer sequence comparison to identify evolutionarily related proteins, or homologs, is one of the most common tasks in bioinformatics. Support vector machines (SVMs) represent a new approach to the problem, in which statistical learning theory is employed to classify proteins into families and thereby identify homologous relationships. Current SVM approaches have been shown to outperform iterative profile methods, such as PSI-BLAST, for protein homology classification. In this study, we demonstrate that using a Bayesian alignment score, which accounts for the uncertainty over all possible alignments, in the SVM construction improves sensitivity relative to the traditional dynamic programming implementation on a benchmark dataset of 54 unique protein families. The SVM-BALSA algorithm returns a higher area under the receiver operating characteristic (ROC) curve for 37 of the 54 families and achieves an improved overall performance curve at a significance level of 0.07.
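The per-family comparison reported above rests on the area under the ROC curve. The following is a minimal sketch of how such a comparison could be computed; it is not the authors' evaluation code. The function names `roc_auc` and `count_wins`, the family labels, and all scores are hypothetical and chosen only for illustration. The AUC is computed from the pairwise (Mann-Whitney) definition: the fraction of positive/negative pairs in which the positive example receives the higher score.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count as 0.5).

    scores: classifier outputs, higher meaning "more likely a family member".
    labels: 1 for true family members, 0 for non-members.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def count_wins(auc_a, auc_b):
    """Count families for which method A has a strictly higher AUC than method B."""
    return sum(1 for family in auc_a if auc_a[family] > auc_b[family])


# Toy data (hypothetical): two families, four test sequences each.
labels = [1, 1, 0, 0]
auc_bayes = {
    "family_1": roc_auc([0.9, 0.8, 0.3, 0.1], labels),
    "family_2": roc_auc([0.7, 0.6, 0.5, 0.2], labels),
}
auc_dp = {
    "family_1": roc_auc([0.9, 0.2, 0.8, 0.1], labels),
    "family_2": roc_auc([0.7, 0.6, 0.5, 0.2], labels),
}
print(count_wins(auc_bayes, auc_dp), "of", len(auc_bayes), "families improved")
```

Applied to a real benchmark, the same tally over all families would give a count analogous to the 37 of 54 reported here. The abstract does not name the statistical test behind the 0.07 significance level, so none is sketched.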

[Indexed for MEDLINE]
