Comparison of four variable selection methods to determine the important variables in predicting the prognosis of traumatic brain injury patients by support vector machine

J Res Med Sci. 2019 Nov 27;24:97. doi: 10.4103/jrms.JRMS_89_18. eCollection 2019.

Abstract

Background: The growing volume of available information has increased computational complexity, which makes data dimension reduction critical to preliminary analysis. In this research, four variable selection (VS) methods are compared to identify the important variables for predicting the prognosis of traumatic brain injury (TBI) patients.

Materials and methods: In a retrospective follow-up study, 741 TBI patients who were hospitalized for at least 2 days and had a Glasgow Coma Scale score of at least one were followed. Their clinical data recorded during intensive care unit (ICU) admission and their outcomes on the eight-category extended Glasgow Outcome Scale (GOS) 6 months after discharge were used. Two filter-based and two wrapper-based VS methods were applied for comparison. A support vector machine (SVM) classifier was then trained on each selected subset, and the sensitivity, specificity, accuracy, and area under the receiver operating characteristic curve (AUC) were calculated.
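The four performance measures named above can be illustrated with a minimal sketch in plain Python. It assumes a binary outcome (1 = the class treated as positive), which is an assumption made here for illustration; the abstract does not state how the eight outcome categories were grouped. The function names, the toy labels, and the 0.5 decision threshold are all hypothetical, and the AUC is computed with the standard pairwise-ranking definition rather than any method confirmed by the study.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def classification_metrics(y_true, y_pred):
    """Sensitivity, specificity, and accuracy from hard class predictions."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return {
        "sensitivity": tp / (tp + fn),  # share of positives correctly identified
        "specificity": tn / (tn + fp),  # share of negatives correctly identified
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
    }


def auc(y_true, scores):
    """AUC as the probability that a random positive case outscores a random
    negative case; tied scores count as half a win."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical, not from the study): 3 positive and 5 negative cases.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1, 0.05]  # classifier decision scores
y_pred = [1 if s >= 0.5 else 0 for s in scores]      # assumed threshold of 0.5

metrics = classification_metrics(y_true, y_pred)
auc_value = auc(y_true, scores)
```

Sensitivity, specificity, and accuracy depend on the chosen threshold, whereas the AUC summarizes ranking quality across all thresholds, which is one reason the two kinds of measure are usually reported together.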

Results: Theoretically, the variables selected by the sequential forward selection (SFS) method would predict the prognosis better than those selected by the other methods (AUC = 0.737, 95% confidence interval [0.701, 0.772], specificity = 89.2%, sensitivity = 58.9%, and accuracy = 79.1%). The genetic algorithm (GA), minimum redundancy maximum relevance (MRMR), and mutual information methods ranked next, in that order.
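As a rough illustration of how a wrapper method such as SFS operates, the sketch below greedily adds one variable at a time and stops when no addition improves the score. It is a generic version of the technique, not the authors' implementation: the function name, the `max_features` parameter, the variable names `x1` to `x4`, and the toy scoring function are all hypothetical. In a setting like the one described here, `score_fn` would presumably return a validation measure (for example the AUC) of an SVM trained on the candidate subset.

```python
def sequential_forward_selection(features, score_fn, max_features=None):
    """Greedy forward selection.

    features: list of candidate variable names.
    score_fn: callable that takes a list of variable names and returns a
              score (higher is better), e.g. validated classifier performance.
    Returns the selected variables and the score of that subset.
    """
    selected = []
    remaining = list(features)
    best_score = float("-inf")
    limit = len(remaining) if max_features is None else max_features

    while remaining and len(selected) < limit:
        # Score every subset formed by adding one remaining variable.
        candidates = [(score_fn(selected + [f]), f) for f in remaining]
        cand_score, cand_feature = max(candidates, key=lambda c: c[0])

        # Stop as soon as the best addition no longer improves the score.
        if cand_score <= best_score:
            break

        selected.append(cand_feature)
        remaining.remove(cand_feature)
        best_score = cand_score

    return selected, best_score


# Toy scoring function (hypothetical): two informative variables, two
# uninformative ones, and a small penalty per variable in the subset.
GAIN = {"x1": 0.20, "x2": 0.10, "x3": 0.0, "x4": 0.0}


def toy_score(subset):
    return 0.5 + sum(GAIN[f] for f in subset) - 0.01 * len(subset)


chosen, chosen_score = sequential_forward_selection(
    ["x1", "x2", "x3", "x4"], toy_score
)
```

Because the classifier's own performance drives each step, a wrapper search of this kind costs many more model fits than a filter method, which ranks variables by a statistic such as mutual information without training the classifier.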

Conclusion: Applying an SVM classifier to the optimal subsets given by GA and SFS indicates that wrapper-based methods perform better than filter-based methods on our data set, although all selected subsets, except the one given by MRMR, were clinically acceptable. In addition, for prognosis prediction of TBI patients, a small subset of the clinical records collected during ICU admission is enough to achieve acceptable accuracy.

Keywords: Variable selection; filter; prediction; prognosis; support vector machine; traumatic brain injury; wrapper.