Send to

Choose Destination
Brief Bioinform. 2012 May;13(3):337-49. doi: 10.1093/bib/bbr059. Epub 2011 Dec 2.

A review of statistical methods for prediction of proteolytic cleavage.

Author information

Bioinformatics Center, Kyoto University, Uji, Kyoto 611-0011, Japan.


A fundamental component of systems biology, proteolytic cleavage is involved in nearly all aspects of cellular activities: from gene regulation to cell lifecycle regulation. Current sequencing technologies have made it possible to compile large amount of cleavage data and brought greater understanding of the underlying protein interactions. However, the practical impossibility to exhaustively retrieve substrate sequences through experimentation alone has long highlighted the need for efficient computational prediction methods. Such methods must be able to quickly mark substrate candidates and putative cleavage sites for further analysis. Available methods and expected reliability depend heavily on the type and complexity of proteolytic action, as well as the availability of well-labelled experimental data sets: factors varying greatly across enzyme families. For this review, we chose to give a quick overview of the general issues and challenges in cleavage prediction methods followed by a more in-depth presentation of major techniques and implementations, with a focus on two particular families of cysteine proteases: caspases and calpains. Through their respective differences in proteolytic specificity (high for caspases, broader for calpains) and data availability (much lower for calpains), we aimed to illustrate the strengths and limitations of techniques ranging from position-based matrices and decision trees to more flexible machine-learning methods such as hidden Markov models and Support Vector Machines. In addition to a technical overview for each family of algorithms, we tried to provide elements of evaluation and performance comparison across methods.

[Indexed for MEDLINE]

Supplemental Content

Full text links

Icon for Silverchair Information Systems
Loading ...
Support Center