Format

Send to

Choose Destination
IEEE Trans Nanobioscience. 2018 Jul;17(3):243-250. doi: 10.1109/TNB.2018.2842219. Epub 2018 May 31.

XGBFEMF: An XGBoost-Based Framework for Essential Protein Prediction.

Abstract

Essential proteins as a vital part of maintaining the cells' life play an important role in the study of biology and drug design. With the generation of large amounts of biological data related to essential proteins, an increasing number of computational methods have been proposed. Different from the methods which adopt a single machine learning method or an ensemble machine learning method, this paper proposes a predicting framework named by XGBFEMF for identifying essential proteins, which includes a SUB-EXPAND-SHRINK method for constructing the composite features with original features and obtaining the better subset of features for essential protein prediction, and also includes a model fusion method for getting a more effective prediction model. We carry out experiments on Yeast data to assess the performance of the XGBFEMF with ROC analysis, accuracy analysis, and top analysis. Meanwhile, we set up experiments on E. coli data for the validation of performance. The test results show that the XGBFEMF framework can effectively improve many essential indicators. In addition, we analyze each step in the XGBFEMF framework; our results show that both each step of the SUB-EXPAND-SHRINK method as well as the step of multi-model fusion can improve prediction performance.

PMID:
29993553
DOI:
10.1109/TNB.2018.2842219
[Indexed for MEDLINE]

Supplemental Content

Full text links

Icon for IEEE Engineering in Medicine and Biology Society
Loading ...
Support Center