Bioinformatics. 2001 Apr;17(4):349-58.

Multi-class protein fold recognition using support vector machines and neural networks.

Author information

  • NERSC Division, Lawrence Berkeley National Laboratory, University of California, Berkeley, CA 94720, USA.

Abstract

MOTIVATION:

Protein fold recognition is an important approach to structure discovery that does not rely on sequence similarity. We study this approach with new multi-class classification methods and examine many issues important for a practical recognition system.

RESULTS:

Most current discriminative methods for protein fold prediction use the one-against-others method, which has the well-known 'False Positives' problem. We investigated two new methods: the unique one-against-others and the all-against-all methods. Both improve prediction accuracy by 14-110% on a dataset containing 27 SCOP folds. We used the Support Vector Machine (SVM) and the Neural Network (NN) learning methods as base classifiers. SVMs converge fast and lead to high accuracy. When scores of multiple parameter datasets are combined, majority voting reduces noise and increases recognition accuracy. We examined many issues involved with a large number of classes, including the dependence of prediction accuracy on the number of folds and on the number of representatives in a fold. Overall, the recognition systems achieve 56% fold prediction accuracy on a protein test dataset in which most of the proteins have below 25% sequence identity with the proteins used in training.
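The all-against-all scheme and the majority-voting step named above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the classifier settings or the feature sets, so a nearest-centroid rule stands in for the SVM or NN base classifier, and every function name here (`train_all_against_all`, `predict_all_against_all`, `majority_vote`) is hypothetical.

```python
from collections import Counter
from itertools import combinations


def centroid(points):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_all_against_all(X, y):
    """Build one two-class model per unordered pair of classes.

    Assumption: each 'model' is just the pair of class centroids,
    a stand-in for the SVM or NN base classifier in the abstract.
    """
    by_class = {}
    for features, label in zip(X, y):
        by_class.setdefault(label, []).append(features)
    centroids = {label: centroid(pts) for label, pts in by_class.items()}
    return {
        (a, b): (centroids[a], centroids[b])
        for a, b in combinations(sorted(centroids), 2)
    }


def predict_all_against_all(models, x):
    """Each pairwise model casts one vote; the most-voted class wins."""
    votes = Counter()
    for (a, b), (ca, cb) in models.items():
        votes[a if sq_dist(x, ca) <= sq_dist(x, cb) else b] += 1
    return votes.most_common(1)[0][0]


def majority_vote(predictions):
    """Combine the labels predicted from several feature sets."""
    return Counter(predictions).most_common(1)[0][0]


# Toy data: three made-up classes in a 2-D feature space.
X = [[0, 0], [0, 1], [5, 5], [5, 6], [10, 0], [10, 1]]
y = ["A", "A", "B", "B", "C", "C"]
models = train_all_against_all(X, y)
print(len(models))                               # one model per class pair
print(predict_all_against_all(models, [9, 1]))
print(majority_vote(["A", "B", "A"]))
```

With K classes this scheme trains K(K-1)/2 pairwise classifiers, so 27 folds would require 351 of them. In this sketch ties in the vote count are broken by whichever class received its first vote earliest, which is an arbitrary choice of the illustration.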

PMID: 11301304
[PubMed - indexed for MEDLINE]