J Comput Biol. 2010 Jan;17(1):55-72. doi: 10.1089/cmb.2009.0029.

Graphlet kernels for prediction of functional residues in protein structures.

Author information

  • Department of Computer Science and Engineering, University of California, Riverside, California, USA.

Abstract

We introduce a novel graph-based kernel method for annotating functional residues in protein structures. A structure is first modeled as a protein contact graph, where nodes correspond to residues and edges connect spatially neighboring residues. Each vertex in the graph is then represented as a vector of counts of labeled non-isomorphic subgraphs (graphlets), centered on the vertex of interest. A similarity measure between two vertices is expressed as the inner product of their respective count vectors and is used in a supervised learning framework to classify protein residues. We evaluated our method on two function prediction problems: identification of catalytic residues in proteins, which is a well-studied problem suitable for benchmarking, and a much less explored problem of predicting phosphorylation sites in protein structures. The performance of the graphlet kernel approach was then compared against two alternative methods, a sequence-based predictor and our implementation of the FEATURE framework. On both tasks, the graphlet kernel performed favorably; however, the margin was considerably larger on the problem of phosphorylation site prediction. While there is evidence that phosphorylation sites are preferentially positioned in intrinsically disordered regions, we provide evidence that for the sites that are located in structured regions, neither the surface accessibility alone nor the averaged measures calculated from the residue microenvironments utilized by FEATURE were sufficient to achieve high accuracy. The key benefit of the graphlet representation is its ability to capture neighborhood similarities in protein structures by enumerating the patterns of local connectivity in the corresponding labeled graphs.
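The pipeline described in the abstract (contact graph, per-vertex graphlet counts, inner product) can be illustrated with a minimal Python sketch. This is not the authors' implementation. It rests on several assumptions that do not come from the abstract: residues are reduced to single 3D points, two residues are in contact when those points lie within a hypothetical 6.0 Å cutoff, and only graphlets of up to three nodes are counted. The function names `contact_graph`, `graphlet_counts`, and `graphlet_kernel` are illustrative.

```python
import math
from collections import Counter
from itertools import combinations


def contact_graph(coords, cutoff=6.0):
    """Build adjacency sets; residues i and j are linked if within `cutoff`.

    The cutoff value and the one-point-per-residue model are assumptions.
    """
    adj = {i: set() for i in range(len(coords))}
    for i, j in combinations(range(len(coords)), 2):
        if math.dist(coords[i], coords[j]) <= cutoff:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def graphlet_counts(adj, labels, v):
    """Count labeled graphlets of 1 to 3 nodes that contain vertex v.

    Keys record the graphlet type, the position of v in it, and the
    residue labels, so differently labeled graphlets are kept apart.
    """
    counts = Counter()
    counts[("node", labels[v])] += 1
    for u in adj[v]:
        counts[("edge", labels[v], labels[u])] += 1
    # Three-node graphlets in which both other nodes touch v.
    for u, w in combinations(sorted(adj[v]), 2):
        pair = tuple(sorted((labels[u], labels[w])))
        kind = "triangle" if w in adj[u] else "path_center"
        counts[(kind, labels[v]) + pair] += 1
    # Three-node paths v-u-w in which v is an endpoint.
    for u in adj[v]:
        for w in adj[u]:
            if w != v and w not in adj[v]:
                counts[("path_end", labels[v], labels[u], labels[w])] += 1
    return counts


def graphlet_kernel(c1, c2):
    """Inner product of two sparse count vectors."""
    return sum(c1[k] * c2[k] for k in c1.keys() & c2.keys())


# Toy example: four residues on a line, 4 Å apart, so only
# consecutive residues are in contact under the 6 Å cutoff.
coords = [(0, 0, 0), (4, 0, 0), (8, 0, 0), (12, 0, 0)]
labels = ["S", "G", "S", "G"]
adj = contact_graph(coords)
c0 = graphlet_counts(adj, labels, 0)
c2 = graphlet_counts(adj, labels, 2)
print(graphlet_kernel(c0, c2))  # 4
```

Storing the counts sparsely in a `Counter` avoids enumerating every possible labeled graphlet in advance. A matrix of such kernel values between all pairs of residues could then be passed to any kernel-based classifier; the abstract does not say which one the authors used.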

PMID: 20078397 [PubMed - indexed for MEDLINE]
PMCID: PMC2921594 (Free PMC Article)
