A Hybrid Risk Assessment Model for Cardiovascular Disease Using Cox Regression Analysis and a 2-Means Clustering Algorithm

Comput Biol Med. 2019 Oct:113:103400. doi: 10.1016/j.compbiomed.2019.103400. Epub 2019 Aug 27.

Abstract

Cardiovascular disease (CVD) refers to conditions involving narrowed or blocked blood vessels, which can lead to cardiac arrest, chest pain (angina) or stroke. CVD is a leading cause of silent massive heart attacks and a major threat to life. Predicting only the presence or absence of CVD is no longer sufficient. What is needed is prediction of CVD together with knowledge about the disease and an assessment of the likelihood that an individual will experience cardiac arrest. This paper establishes an individual CVD risk assessment using a hybrid model. An individual's CVD is due to various controllable and uncontrollable factors, and computing and analyzing all of them is difficult and time consuming; only a few attributes are identified as the most critical. These critical features are selected using a modified Differential Evolution (DE) algorithm, and they are sufficient to predict the presence or absence of CVD. The critical features of each individual are then analyzed using Cox regression, which evaluates the prevalence rates of the critical attributes. These individual prevalence rates are combined into a cumulative prevalence ratio for each individual. The cumulative prevalence ratio, along with the class attribute, is processed using 2-means clustering to determine that individual's risk of developing CVD. The risk assessment model is evaluated by the prediction accuracy of the Cox regression analysis and the Davies-Bouldin (DB) index of the 2-means clustering. Cox regression analysis achieves 91% CVD prediction accuracy using the critical attributes, which is higher than that of other models. The DB index of 2-means clustering with specific initial means is 0.282 for clusters of individuals with CVD and 0.2836 for clusters of individuals without CVD, both lower than those of the traditional k-means clustering algorithm.
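
The clustering and evaluation step described above can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the toy ratio values and the initial means (0.0 and 1.0) are all hypothetical, and it assumes each individual is represented by a single scalar cumulative prevalence ratio. It runs k-means with k = 2 from caller-supplied initial means and then computes the Davies-Bouldin index, which for two clusters reduces to the sum of the two within-cluster scatters divided by the distance between the two centers (lower is better).

```python
from statistics import mean

def two_means_1d(values, init_means, max_iter=100):
    """Lloyd's algorithm with k=2 on scalar values, started from given means."""
    m0, m1 = init_means
    c0, c1 = [], []
    for _ in range(max_iter):
        # Assign each value to the nearer of the two current means.
        c0 = [v for v in values if abs(v - m0) <= abs(v - m1)]
        c1 = [v for v in values if abs(v - m0) > abs(v - m1)]
        if not c0 or not c1:
            break  # degenerate split; keep the last means
        new0, new1 = mean(c0), mean(c1)
        if (new0, new1) == (m0, m1):
            break  # converged: means no longer move
        m0, m1 = new0, new1
    return (m0, m1), (c0, c1)

def davies_bouldin_2(clusters, centers):
    """DB index for exactly two clusters: (S0 + S1) / |c0 - c1|."""
    # S_i is the mean absolute distance of cluster members to their center.
    scatter = [mean(abs(v - c) for v in cl) for cl, c in zip(clusters, centers)]
    return (scatter[0] + scatter[1]) / abs(centers[0] - centers[1])

# Hypothetical cumulative prevalence ratios for six individuals (toy data).
ratios = [0.10, 0.15, 0.20, 0.80, 0.85, 0.90]
centers, clusters = two_means_1d(ratios, init_means=(0.0, 1.0))
db_index = davies_bouldin_2(clusters, centers)
print(centers, clusters, round(db_index, 4))
```

On this toy data the low-ratio and high-ratio individuals separate cleanly, so the DB index is small. Fixing the initial means, rather than drawing them at random as standard k-means often does, makes the result reproducible, which is one plausible reading of the "specific initial means" mentioned in the abstract.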

Keywords: Cardiovascular disease; Cox regression; K-means clustering; Modified differential evolution; Risk assessment.

MeSH terms

  • Aged
  • Algorithms*
  • Cardiovascular Diseases* / pathology
  • Cardiovascular Diseases* / physiopathology
  • Cluster Analysis
  • Female
  • Humans
  • Male
  • Middle Aged
  • Models, Cardiovascular*
  • Risk Assessment