Nucleic Acids Res. 2004 Jul 7;32(12):3522-30. Print 2004.

Sequence-based prediction of protein domains.

Author information

CUBIC, Department of Biochemistry and Molecular Biophysics, Columbia University, New York, NY 10032, USA. liu@cubic.bioc.columbia.edu

Abstract

Guessing the boundaries of structural domains has been an important and challenging problem in experimental and computational structural biology. Predictions were based on intuition, biochemical properties, statistics, sequence homology and other aspects of predicted protein structure. Here, we introduced CHOPnet, a de novo method that predicts structural domains in the absence of homology to known domains. Our method was based on neural networks and relied exclusively on information available for all proteins. Evaluating sustained performance through rigorous cross-validation on proteins of known structure, we correctly predicted the number of domains in 69% of all proteins. For 50% of the two-domain proteins the centre of the predicted boundary was closer than 20 residues to the boundary assigned from three-dimensional (3D) structures; this was about eight percentage points better than predictions by 'equal split'. Our results appeared to compare favourably with those from previously published methods. CHOPnet may be useful to restrict the experimental testing of different fragments for structure determination in the context of structural genomics.
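The boundary criterion described in the abstract (predicted boundary closer than 20 residues to the structure-derived boundary, compared against an 'equal split' baseline) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names `equal_split_boundary` and `fraction_within` are hypothetical, the toy proteins are invented purely for illustration, and the sketch assumes that 'equal split' means placing the single boundary of a two-domain protein at the sequence midpoint.

```python
from typing import Sequence


def equal_split_boundary(length: int) -> int:
    """Assumed baseline: put the one boundary of a two-domain protein at the midpoint."""
    return length // 2


def fraction_within(predicted: Sequence[int],
                    observed: Sequence[int],
                    tolerance: int = 20) -> float:
    """Fraction of proteins whose predicted boundary is closer than
    `tolerance` residues to the observed (structure-derived) boundary."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not predicted:
        raise ValueError("need at least one protein")
    hits = sum(1 for p, o in zip(predicted, observed) if abs(p - o) < tolerance)
    return hits / len(predicted)


# Invented toy data, NOT taken from the paper:
# (sequence length, observed boundary, hypothetical predicted boundary)
toy_proteins = [
    (200, 95, 110),
    (300, 100, 125),
    (150, 80, 70),
    (400, 260, 250),
]

lengths = [length for length, _, _ in toy_proteins]
observed = [obs for _, obs, _ in toy_proteins]
predicted = [pred for _, _, pred in toy_proteins]

# Baseline boundaries from the assumed midpoint rule.
baseline = [equal_split_boundary(n) for n in lengths]

method_score = fraction_within(predicted, observed)
baseline_score = fraction_within(baseline, observed)

# Difference expressed in percentage points, as in the abstract's comparison.
gain_points = 100 * (method_score - baseline_score)
print(f"method: {method_score:.2f}, equal split: {baseline_score:.2f}, "
      f"gain: {gain_points:.0f} percentage points")
```

The comparison is reported in percentage points because both scores are fractions of the same set of two-domain proteins, so their difference is directly interpretable.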

PMID: 15240828
PMCID: PMC484172
DOI: 10.1093/nar/gkh684
[Indexed for MEDLINE]
Free PMC Article