    Bioinformatics. 2010 Jul 15;26(14):1714-22. doi: 10.1093/bioinformatics/btq267. Epub 2010 May 26.

    Prediction of protease substrates using sequence and structure features.

    Source

    Graduate Group in Bioinformatics, Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA 94158, USA.

    Abstract

    MOTIVATION:

    Granzyme B (GrB) and caspases cleave specific protein substrates to induce apoptosis in virally infected and neoplastic cells. While substrates for both types of proteases have been determined experimentally, there are many more yet to be discovered in humans and other metazoans. Here, we present a bioinformatics method based on support vector machine (SVM) learning that identifies sequence and structural features important for protease recognition of substrate peptides and then uses these features to predict novel substrates. Our approach can act as a convenient hypothesis generator, guiding future experiments by high-confidence identification of peptide-protein partners.
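
    As a rough illustration of how a substrate peptide might be turned into sequence features for an SVM, the sketch below one-hot encodes each residue. The abstract does not state the encoding used, so the 20-letter alphabet ordering, the function name `one_hot_peptide`, and the example peptide are all hypothetical choices made for illustration; structural features are not covered here.

```python
# Assumption: standard 20 amino acids in alphabetical one-letter order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def one_hot_peptide(peptide):
    """Encode a peptide as a flat 0/1 vector with 20 positions per residue.

    Illustrative sketch only; not the encoding described by the authors.
    """
    vector = []
    for residue in peptide.upper():
        if residue not in _INDEX:
            raise ValueError(f"unknown residue: {residue!r}")
        row = [0] * len(AMINO_ACIDS)
        row[_INDEX[residue]] = 1  # mark the residue present at this position
        vector.extend(row)
    return vector


# Hypothetical 8-residue peptide, used only to show the output shape.
features = one_hot_peptide("IETDSGVD")
print(len(features))  # 160 = 8 residues x 20 amino acids
```

    A vector like this could then be passed, together with substrate and non-substrate labels, to any SVM implementation.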

    RESULTS:

    The method is benchmarked on the known substrates of both protease types, including our literature-curated GrB substrate set (GrBah). On these benchmark sets, the method outperforms a number of other methods that consider sequence only, predicting at a 0.87 true positive rate (TPR) and a 0.13 false positive rate (FPR) for caspase substrates, and a 0.79 TPR and a 0.21 FPR for GrB substrates. The method is then applied to approximately 25 000 proteins in the human proteome to generate a ranked list of predicted substrates of each protease type. Two of these predictions, AIF-1 and SMN1, were selected for further experimental analysis, and each was validated as a GrB substrate.
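
    For reference, the TPR and FPR quoted above follow the standard definitions TPR = TP / (TP + FN) and FPR = FP / (FP + TN). The sketch below computes both from binary labels and predictions; the function name `rates` and the toy data are assumptions for illustration and are not taken from the benchmark sets.

```python
def rates(labels, predictions):
    """Return (TPR, FPR) for binary labels/predictions, where 1 = substrate."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)  # true positives
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)  # false negatives
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)  # false positives
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)  # true negatives
    if tp + fn == 0 or fp + tn == 0:
        raise ValueError("need at least one positive and one negative label")
    return tp / (tp + fn), fp / (fp + tn)


# Toy data (made up): 4 known substrates, 4 non-substrates.
labels      = [1, 1, 1, 1, 0, 0, 0, 0]
predictions = [1, 1, 1, 0, 1, 0, 0, 0]
tpr, fpr = rates(labels, predictions)
print(tpr, fpr)  # 0.75 0.25
```

    Both reported pairs (0.87/0.13 and 0.79/0.21) sum to 1, which would be consistent with an operating point where sensitivity equals specificity, although the abstract does not say how the threshold was chosen.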

    AVAILABILITY:

    All predictions for both protease types are publicly available at http://salilab.org/peptide. A web server at the same site allows a user to train new SVM models and make predictions for any protein that recognizes specific oligopeptide ligands.

    PMID: 20505003 [PubMed - indexed for MEDLINE]
    PMCID: PMC2894511 (free PMC article)
