Prediction of presynaptic and postsynaptic neurotoxins based on feature extraction

Wen Zhu; Yuxin Guo; Quan Zou

doi:10.3934/mbe.2021297

Math Biosci Eng. 2021 Jun 30;18(5):5943-5958. doi: 10.3934/mbe.2021297.

Authors

Wen Zhu¹ ² ³, Yuxin Guo¹ ² ³, Quan Zou⁴

Affiliations

¹ Key Laboratory of Computational Science and Application of Hainan Province, Haikou, China.
² Key Laboratory of Data Science and Intelligence Education, Hainan Normal University, Ministry of Education, Haikou, China.
³ School of Mathematics and Statistics, Hainan Normal University, Haikou, China.
⁴ Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, China.

PMID: 34517517
DOI: 10.3934/mbe.2021297

Abstract

A neurotoxin is essentially a protein that acts mainly on the nervous system. It has a selective toxic effect on the central nervous system and neuromuscular junctions, can cause muscle and respiratory paralysis, and is highly lethal. According to their mechanism of action, neurotoxins are divided into presynaptic and postsynaptic neurotoxins. Correctly identifying presynaptic and postsynaptic neurotoxins provides important clues for future drug development and the discovery of drug targets. This paper therefore constructs a predictive model, Neu_LR. The monoMonokGap method was used to extract frequency features from presynaptic and postsynaptic neurotoxin sequences, followed by feature selection. Based on the important features retained after dimensionality reduction, Neu_LR was built with a logistic regression algorithm and evaluated by ten-fold cross-validation and on an independent test set. The final accuracies were 99.6078% and 94.1176%, respectively, indicating that Neu_LR has good predictive performance and robustness and can meet the requirements for predicting presynaptic and postsynaptic neurotoxins. The data and source code of the model can be freely downloaded from https://github.com/gyx123681/.
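The feature-extraction step can be illustrated with a minimal sketch. The abstract does not define monoMonokGap, so the code below assumes it means the frequency of pairs of single residues separated by a fixed number of intervening positions. The function name `mono_mono_kgap`, the `max_gap` parameter, the feature-key format, and the per-gap normalization are all hypothetical choices made for illustration, not the authors' implementation.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def mono_mono_kgap(sequence, max_gap=1):
    """Return gapped residue-pair frequencies for one protein sequence.

    Hypothetical helper (assumed definition): for each gap g in 1..max_gap,
    count pairs (a, b) where b lies g + 1 positions after a, then divide by
    the number of valid pairs at that gap. Pairs containing a non-standard
    residue are skipped.
    """
    seq = sequence.upper()
    features = {}
    for gap in range(1, max_gap + 1):
        counts = {a + b: 0 for a, b in product(AMINO_ACIDS, repeat=2)}
        total = 0
        for i in range(len(seq) - gap - 1):
            pair = seq[i] + seq[i + gap + 1]
            if pair in counts:
                counts[pair] += 1
                total += 1
        for pair, n in counts.items():
            # Key such as "A_D" (gap 1) or "A__D" (gap 2); format is assumed.
            key = f"{pair[0]}{'_' * gap}{pair[1]}"
            features[key] = n / total if total else 0.0
    return features


if __name__ == "__main__":
    feats = mono_mono_kgap("ACDACD", max_gap=2)
    print(len(feats))     # 800 (400 pairs for each of the 2 gaps)
    print(feats["A_D"])   # 0.5 (2 of the 4 gap-1 pairs are A?D)
```

Under this assumption each sequence maps to a fixed-length vector of 400 features per gap value, which is the kind of input that feature selection and a logistic regression classifier would then operate on.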

Keywords: Neu_LR; logistic regression; monoMonokGap; neurotoxin; protein classification.

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Algorithms*
Neurotoxins* / toxicity

Substances

Neurotoxins