J Bioinform Comput Biol. 2017 Feb;15(1):1650025. doi: 10.1142/S0219720016500256. Epub 2016 Jun 14.

A machine-learning approach for predicting palmitoylation sites from integrated sequence-based features.

Author information

1. Department of General Surgery, Xinqiao Hospital, Third Military Medical University, Chongqing 400037, China.
2. National Drug Clinical Trial Institution, Xinqiao Hospital, Third Military Medical University, Chongqing 400037, China.
3. Institute of Cancer, Xinqiao Hospital, Third Military Medical University, Chongqing 400037, China.
4. Department of Mathematics, Shanghai Normal University, Shanghai 200234, China.

Abstract

Palmitoylation is the covalent attachment of lipids to amino acid residues in proteins. As an important form of protein posttranslational modification, it increases the hydrophobicity of proteins, which contributes to protein transport, organelle localization, and function, and it therefore plays an important role in a variety of cellular processes. Identification of palmitoylation sites is necessary for understanding protein-protein interactions, protein stability, and activity. Because conventional experimental techniques for determining palmitoylation sites in proteins are both labor-intensive and costly, a fast and accurate computational approach to predicting palmitoylation sites from protein sequences is urgently needed. In this study, a support vector machine (SVM)-based method was proposed that integrates PSI-BLAST profiles, physicochemical properties, k-mer amino acid compositions (AACs), and k-mer pseudo AACs into the principal feature vector. A recursive feature selection scheme was subsequently implemented to single out the most discriminative features. Finally, an SVM was used to predict palmitoylation sites in proteins based on the optimal features. The proposed method achieved an accuracy of 99.41% and a Matthews correlation coefficient of 0.9773 on a benchmark dataset. These results indicate that the method predicts palmitoylation sites from protein sequences efficiently and accurately.
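
As a rough illustration of two quantities named in the abstract, the sketch below computes a k-mer amino acid composition vector for a peptide window and the Matthews correlation coefficient from confusion-matrix counts. It is not the authors' implementation: the function names `kmer_composition` and `matthews_corrcoef`, the example window, the choice k = 2, and the handling of non-standard residues are all assumptions made for illustration, since the abstract does not specify them.

```python
from itertools import product
from math import sqrt

# The 20 standard amino acids, in a fixed order so feature vectors are comparable.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def kmer_composition(window, k=2):
    """Return normalized k-mer frequencies of a peptide window (hypothetical helper).

    The result has 20**k entries, ordered by the Cartesian product of AMINO_ACIDS.
    """
    kmers = ["".join(p) for p in product(AMINO_ACIDS, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(window) - k + 1):
        kmer = window[i:i + k]
        if kmer in counts:  # assumption: skip k-mers with non-standard residues
            counts[kmer] += 1
            total += 1
    return [counts[m] / total if total else 0.0 for m in kmers]


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention: return 0.0 when the denominator vanishes.
    return (tp * tn - fp * fn) / denom if denom else 0.0


if __name__ == "__main__":
    features = kmer_composition("MKLCCAV", k=2)  # made-up window around a cysteine
    print(len(features))
    print(matthews_corrcoef(tp=90, tn=95, fp=5, fn=10))
```

In a full pipeline, vectors like these would be concatenated with the other feature groups before feature selection and SVM training; that step is omitted here.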

KEYWORDS:

Amino acid physicochemical properties; K-mer amino acid composition; palmitoylation; position-specific score matrix; support vector machine-recursive feature elimination

PMID: 27411307
DOI: 10.1142/S0219720016500256
[Indexed for MEDLINE]
