Pac Symp Biocomput. 2002:564-75.

The spectrum kernel: a string kernel for SVM protein classification.

Author information

Department of Computer Science, Columbia University, New York, NY 10027, USA. cleslie.noble@cs.columbia.edu

Abstract

We introduce a new sequence-similarity kernel, the spectrum kernel, for use with support vector machines (SVMs) in a discriminative approach to the protein classification problem. Our kernel is conceptually simple and efficient to compute and, in experiments on the SCOP database, performs well in comparison with state-of-the-art methods for homology detection. Moreover, our method produces an SVM classifier that allows linear time classification of test sequences. Our experiments provide evidence that string-based kernels, in conjunction with SVMs, could offer a viable and computationally efficient alternative to other methods of protein classification and homology detection.
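The abstract does not spell out the kernel itself. As a minimal sketch, assuming the usual definition of a k-spectrum kernel (the inner product of the two sequences' k-mer count vectors), it could be computed as below. The function names `kmer_counts` and `spectrum_kernel` and the default `k = 3` are illustrative choices, not taken from the paper, and this naive version does not reproduce the efficient data structures or the linear-time classifier the authors describe.

```python
from collections import Counter


def kmer_counts(seq: str, k: int) -> Counter:
    """Count every contiguous length-k substring (k-mer) of seq."""
    # A sequence shorter than k yields an empty range, hence an empty Counter.
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def spectrum_kernel(x: str, y: str, k: int = 3) -> int:
    """Inner product of the k-mer count vectors of x and y (assumed definition)."""
    cx, cy = kmer_counts(x, k), kmer_counts(y, k)
    # Iterate over the smaller table; k-mers absent from the other count as 0.
    if len(cx) > len(cy):
        cx, cy = cy, cx
    return sum(count * cy[kmer] for kmer, count in cx.items())


if __name__ == "__main__":
    # Toy strings, not real protein sequences.
    print(spectrum_kernel("ABAB", "BABA", k=2))
```

Because the value is a plain dot product of count vectors, a matrix of such values over a training set could in principle be passed to any SVM implementation that accepts a precomputed kernel.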

PMID: 11928508 [Indexed for MEDLINE]
Free full text
