Machine learning to predict mortality after rehabilitation among patients with severe stroke

Domenico Scrutinio; Carlo Ricciardi; Leandro Donisi; Ernesto Losavio; Petronilla Battista; Pietro Guida; Mario Cesarelli; Gaetano Pagano; Giovanni D'Addio

doi:10.1038/s41598-020-77243-3

Sci Rep. 2020 Nov 18;10(1):20127. doi: 10.1038/s41598-020-77243-3.

Affiliations

¹ Istituti Clinici Scientifici Maugeri IRCCS, Pavia, Italy.
² Istituti Clinici Scientifici Maugeri IRCCS, Pavia, Italy. carloricciardi.93@gmail.com.
³ Department of Advanced Biomedical Sciences, University Hospital of Naples "Federico II", Naples, Italy. carloricciardi.93@gmail.com.
⁴ Department of Advanced Biomedical Sciences, University Hospital of Naples "Federico II", Naples, Italy.
⁵ Department of Electrical Engineering and Information Technology, University of Naples "Federico II", Naples, Italy.

Abstract

Stroke is among the leading causes of death and disability worldwide. Approximately 20-25% of stroke survivors present severe disability, which is associated with increased mortality risk. Prognostication is inherent in the process of clinical decision-making. Machine learning (ML) methods have gained increasing popularity in biomedical research. The aim of this study was twofold: to assess the performance of tree-based ML algorithms for predicting three-year mortality in 1207 stroke patients with severe disability who completed rehabilitation, and to compare the performance of the ML algorithms with that of a standard logistic regression. The logistic regression model achieved an area under the receiver operating characteristic curve (AUC) of 0.745 and was well calibrated. At the optimal risk threshold, the model had an accuracy of 75.7%, a positive predictive value (PPV) of 33.9%, and a negative predictive value (NPV) of 91.0%. A Random Forest combined with the synthetic minority oversampling technique (SMOTE) outperformed the logistic regression model, achieving an AUC of 0.928 and an accuracy of 86.3%, with a PPV of 84.6% and an NPV of 87.5%. This study represents a step forward in the creation of standardisable tools for predicting health outcomes in individuals affected by stroke.
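
The accuracy, PPV, and NPV reported above are all derived from the same confusion matrix at a chosen risk threshold. The following is a minimal Python sketch of those three definitions, not the authors' code. The class name `ConfusionCounts`, the function names, and the counts are hypothetical and chosen only for illustration; they are not taken from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts at a fixed risk threshold; 'positive' means predicted death."""
    tp: int  # predicted death, patient died
    fp: int  # predicted death, patient survived
    tn: int  # predicted survival, patient survived
    fn: int  # predicted survival, patient died


def _ratio(numerator: int, denominator: int) -> float:
    # Return NaN rather than raising when a metric is undefined.
    return numerator / denominator if denominator else float("nan")


def accuracy(c: ConfusionCounts) -> float:
    """Share of all patients classified correctly."""
    return _ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn)


def ppv(c: ConfusionCounts) -> float:
    """Positive predictive value: share of predicted deaths that occurred."""
    return _ratio(c.tp, c.tp + c.fp)


def npv(c: ConfusionCounts) -> float:
    """Negative predictive value: share of predicted survivors who survived."""
    return _ratio(c.tn, c.tn + c.fn)


# Hypothetical counts for illustration only (NOT data from the study).
counts = ConfusionCounts(tp=20, fp=40, tn=130, fn=10)

print(f"accuracy = {accuracy(counts):.3f}")
print(f"PPV      = {ppv(counts):.3f}")
print(f"NPV      = {npv(counts):.3f}")
```

With these illustrative counts, only 30 of 200 patients die, so a model can reach a high NPV and reasonable accuracy while its PPV stays low. This is the general pattern that class imbalance tends to produce and that oversampling methods such as SMOTE are intended to address.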

MeSH terms

Aged
Algorithms*
Clinical Decision-Making
Female
Humans
Logistic Models
Machine Learning*
Male
Medicare
Middle Aged
Mortality
ROC Curve
Stroke / etiology*
Stroke / mortality
Stroke Rehabilitation / mortality*
United States