Using Deep Learning to Identify High-Risk Patients with Heart Failure with Reduced Ejection Fraction

Zhibo Wang; Xin Chen; Xi Tan; Lingfeng Yang; Kartik Kannapur; Justin L Vincent; Garin N Kessler; Boshu Ru; Mei Yang

doi:10.36469/jheor.2021.25753

J Health Econ Outcomes Res. 2021 Jul 29;8(2):6-13. doi: 10.36469/jheor.2021.25753. eCollection 2021.

Authors

Zhibo Wang¹, Xin Chen², Xi Tan², Lingfeng Yang², Kartik Kannapur³, Justin L Vincent³, Garin N Kessler⁴, Boshu Ru², Mei Yang²

Affiliations

¹ Merck & Co., Inc., Kenilworth, NJ, USA; College of Engineering and Computer Science, University of Central Florida, Orlando, FL, USA.
² Merck & Co., Inc., Kenilworth, NJ, USA.
³ Amazon Web Services Inc., Seattle, WA, USA.
⁴ Amazon Web Services Inc., Seattle, WA, USA; Georgetown University, Washington, DC, USA.

Abstract

Background: Deep learning (DL) has not been well established as a method for identifying high-risk patients among those with heart failure (HF).
Objectives: This study aimed to use DL models to predict hospitalizations, worsening HF events, and 30-day and 90-day readmissions in patients with heart failure with reduced ejection fraction (HFrEF).
Methods: We analyzed the data of adult HFrEF patients from the IBM® MarketScan® Commercial and Medicare Supplement databases between January 1, 2015 and December 31, 2017. A sequential model architecture based on bi-directional long short-term memory (Bi-LSTM) layers was used. For the DL models predicting HF hospitalizations and worsening HF events, we used two study designs: with and without a buffer window. For comparison, we also tested multiple traditional machine learning models, including logistic regression, random forest, and eXtreme Gradient Boosting (XGBoost). Model performance was assessed by area under the curve (AUC) values, precision, and recall on an independent testing dataset.
Results: A total of 47 498 HFrEF patients were included, 9427 of whom had at least one HF hospitalization. The best AUCs of DL models without a buffer window in predicting HF hospitalizations and worsening HF events in the total patient cohort were 0.977 and 0.972, respectively; with a 7-day buffer window, the best AUCs were 0.573 and 0.608, respectively. The best AUCs in predicting 30- and 90-day readmissions in all adult patients were 0.597 and 0.614, respectively. An AUC of 0.861 was attained for prediction of 90-day readmission in patients aged 18-64 years. For all outcomes assessed, the DL approach outperformed traditional machine learning models.
Discussion: The DL approach can automate feature engineering during model training, which can increase clinical applicability and lead to comparable or better model performance. However, the lack of granular clinical data, as well as sample size and class imbalance issues, may have limited the models' performance.
Conclusions: A DL approach using Bi-LSTM was shown to be a feasible and useful tool to predict HF-related outcomes. This study can help inform the future development and deployment of predictive tools to identify high-risk HFrEF patients and ultimately facilitate targeted interventions in clinical practice.
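
Illustrative note: the abstract reports model performance as AUC, precision, and recall on a held-out test set. The sketch below shows one way these three metrics can be computed from binary labels and predicted risk scores. It is not the authors' evaluation code. The function names (`roc_auc`, `precision_recall`), the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration and do not come from the study.

```python
from typing import Sequence, Tuple


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Uses the pairwise (Mann-Whitney) definition; ties count as half a win.
    This is O(n_pos * n_neg), which is fine for a small illustration.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def precision_recall(
    y_true: Sequence[int], y_score: Sequence[float], threshold: float = 0.5
) -> Tuple[float, float]:
    """Precision and recall after thresholding scores (threshold is an assumption)."""
    y_pred = [int(s >= threshold) for s in y_score]
    tp = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical toy data: 1 = event (e.g., a readmission), 0 = no event.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.10, 0.40, 0.35, 0.80]

toy_auc = roc_auc(toy_labels, toy_scores)
toy_precision, toy_recall = precision_recall(toy_labels, toy_scores)
```

AUC is threshold-free, whereas precision and recall depend on the chosen cutoff. With imbalanced outcomes such as those noted in the Discussion, reporting all three gives a fuller picture than AUC alone.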

Keywords: deep learning; heart failure; hospitalizations; machine learning; readmissions; worsening events.

Grants and funding

Funding was provided by Merck Sharp & Dohme Corp., a subsidiary of Merck & Co., Inc., Kenilworth, NJ, USA.