Effect of ferritin, INR, and D-dimer immunological parameters levels as predictors of COVID-19 mortality: A strong prediction with the decision trees

Background and objective: A hyperinflammatory environment is thought to be the distinctive characteristic of COVID-19 infection and an important mediator of morbidity. This study aimed to determine the effect of ferritin and other immunological parameter levels as predictors of COVID-19 mortality via decision-tree analysis. Material and method: This retrospective study evaluated a total of 2568 patients who died (n = 232) or recovered (n = 2336) from COVID-19 between August and December 2021. Immunological laboratory data were compared between the patients who died and those who recovered. In addition, a decision tree, one of the machine learning models, was used to evaluate the performance of immunological parameters in predicting COVID-19 mortality. Results: Compared with surviving patients, non-surviving patients had 1.75 times higher ferritin, 10.7 times higher CRP, 2.4 times higher D-dimer, 1.14 times higher international normalized ratio (INR), 1.1 times higher fibrinogen, 22.9 times higher procalcitonin, 3.35 times higher troponin, 2.77 times higher erythrocyte sedimentation rate (ESR), and 1.13 times longer prothrombin time (PT). In addition, our interpretable decision tree, which was constructed with only the cut-off values of ferritin, INR, and D-dimer, correctly predicted 99.7% of surviving patients and 92.7% of non-surviving patients. Conclusions: Our interpretable decision tree, constructed with ferritin in particular together with INR and D-dimer, predicted COVID-19 mortality with high accuracy. For this reason, we think that it may be important to include ferritin, INR, and D-dimer parameters and their cut-off values in the scoring systems to be planned for COVID-19 mortality.


Introduction
COVID-19 is an infectious disease caused by SARS-CoV-2. Since the World Health Organization (WHO) declared SARS-CoV-2 infection a pandemic, the epidemic has maintained its severity [1,2]. Previous studies showed that this disease can be accompanied by multi-organ dysfunction and cause a range of symptoms [1,3,4].
While many studies indicate that COVID-19 may be asymptomatic, it has also been reported that the disease may be associated with severe ARDS, which is thought to be due to inflammation [4-6]. In addition, Chalmers et al. [7] pointed out that the most important cause of coronavirus deaths is the intense and unbridled release of proinflammatory cytokines. However, efforts to define and treat the relationship between COVID-19 and hyperinflammation continue [8]. Additionally, some studies have highlighted that COVID-19 may be a manifestation of broader hyperinflammatory conditions such as secondary hemophagocytic lymphohistiocytosis (sHLH), which can be characterized by cytokine release syndrome (CRS) [8-11]. One of the most important features of these syndromes is hyperferritinemia [8]. Ferritin, the essential intracellular iron storage protein, is a positive acute phase reactant that rises in many inflammatory conditions [12,13]. In addition, it was stated that very high ferritin levels may cause diseases such as macrophage activation syndrome and septic shock [8,13]. Similarly, many studies have shown the importance of the immunomodulatory effect of ferritin in mortality and inflammatory processes [8,12-14]. Many studies have also reported that ferritin may be a direct mediator of the immune system as a signaling molecule and have emphasized its distinctive role in hyperferritinemia syndromes [7,12,15-17].
High levels of ferritin, D-dimer, lactate dehydrogenase, and IL-6 were reported to be indicators of poor prognosis and risk of COVID-19 mortality [11,12,18]. Efforts to treat COVID-19 have included testing various anti-inflammatory biological agents to inhibit this potent immune response [12,19,20]. Kurniawan et al. [21] reported that hyperinflammation, the coagulation cascade, and multi-organ failure play a role in the etiopathogenesis of COVID-19, and that biomarkers associated with these conditions, such as CRP, D-dimer, LDH, and albumin, may be useful in predicting the outcome of COVID-19.
In a meta-analysis, Hariyanto and Kurniawan [22] found a high correlation between obstructive sleep apnea (OSA) and poor outcomes in COVID-19 patients and stated that OSA increases the mortality of the disease. In further meta-analyses, Hariyanto et al. found a strong association between epilepsy and increased severity of and death from COVID-19 [23], and found that dementia is associated with advanced and fatal COVID-19 infection [24]. Severe COVID-19 causes hypoxia, which increases the risk of thrombosis due to increased blood viscosity and hypoxia-inducible transcription factor-dependent signaling pathways [21].
In recent years, artificial intelligence technologies, which have been used with great success in many fields, have been increasingly used in medicine for the diagnosis and prognosis of diseases and the improvement of treatment processes [25]. The most important reason for this is the power of machine learning (ML) algorithms, which are considered an important part of artificial intelligence (AI) technologies, to reveal hidden relationships between patterns [2,26,27]. In this context, ML approaches applied to medical problems are being developed rapidly because of their high performance in predicting outcomes, reducing drug costs, improving patient health, and making real-time decisions to improve healthcare value and quality [28-30].
The chi-squared automatic interaction detection (CHAID) method, an artificial intelligence-based technique, is frequently used in various disciplines to predict how some variables affect other variables, and it had early applications in medical and psychiatric research [34]. CHAID trees use multiway splits. In studies with small sample sizes, the reliability of the resulting groups decreases rapidly [35]. Therefore, a large sample is needed for decision trees to work effectively. An important advantage of CHAID analysis over alternative methods such as multiple regression is that it is a non-parametric method [36]. Therefore, it is resistant to problems with multicollinearity, outliers, distribution, structure, and missing data [37]. CHAID analysis can model categorical and continuous variables together and provides highly accurate classification or estimation performance in large samples [28,38]. With this method, the relationships between the independent variables are detailed, and easily understandable and interpretable outputs in the form of trees can be obtained even for the most complex models [28]. In addition, since variable combinations are also examined, interactions can be evaluated. Decision tree inducers are algorithms that automatically generate a decision tree from a dataset. Such an algorithm aims to obtain the optimal decision tree by minimizing the generalization error. However, it is also important to define other target functions that affect the performance of the decision tree, such as the number of nodes and the average depth. The induction of an optimal decision tree from data is considered a difficult task [39].
Previous AI studies did not use many of the immunological parameters to predict mortality from COVID-19 and reported relatively lower classifier performance than this study [2,3,5,6]. Moreover, previous studies [9,18-20,25,31-33] have generally focused on the early detection of COVID-19 and have addressed relatively small samples. In addition, there are too few artificial intelligence studies that predict the mortality of the disease based only on immunological parameters and detect patients with a high probability of dying in the early phase.
In particular, the usefulness and effective breakpoints of ferritin and D-dimer in predicting COVID-19 mortality have not yet been fully established [7,12,32,33]. Therefore, the predictive role of ferritin level as a pro-inflammatory factor in the uncontrolled cytokine storm at risk of poor outcome in COVID-19 patients needs further confirmation [7,11]. In addition, we found that a limited number of studies have been conducted on the effect of ferritin in the prediction of COVID-19 mortality and that these studies cover the early stages of the epidemic. In addition, the statistical power of the predictive indicators for the cut-off values of ferritin, which was used in the detection of severe patients in previous studies, was relatively lower than in this study.
In this study, it was determined which immunological parameters are the most important source of variation affecting COVID-19 mortality with the CHAID decision tree from ML models. In this context, important critical levels of other immunological parameters especially serum ferritin, which affects COVID-19 mortality, were calculated and their importance was interpreted to detect surviving and non-surviving patients and to monitor the poor prognosis of the disease. Accordingly, the success of our interpretable decision tree, which was created only with the immunological values in this study, in the detection of surviving and non-surviving patients was calculated. We think that this study will help in the identification of severely infected COVID-19 patients with a high probability of death and will make important contributions to the determination of the poor prognosis of the disease.

Participant criteria and study design
The data used in this retrospective cohort study were collected digitally from the Erzincan Binali Yıldırım University Mengücek Gazi Training and Research Hospital information system between August and December 2021, and the data conforming to our criteria were included in the study. The data include the information of 2568 patients who were hospitalized and treated solely with a diagnosis of COVID-19 between the specified dates. Patients who were not diagnosed with COVID-19 on the specified dates or were younger than 18 years were excluded from the study. In our hospital, COVID-19 was diagnosed only in cases in which SARS-CoV-2 was detected by RT-PCR in nasopharyngeal or oropharyngeal swabs. The age and gender of the COVID-19 patients were recorded. Immunological test records of these patients from admission to discharge, as well as mortality and survival, were examined. The patients were divided into two groups according to their discharge status: those who died from COVID-19 (n = 232) and those who recovered from COVID-19 and survived (n = 2336). Patients who died from COVID-19 were defined as severely ill; 90% of them were intensive care patients. Because our study was retrospective, the comorbidity data of most of the patients could not be accessed.
In this study, demographic data and immunological blood values of patients who died from and survived COVID-19 were compared first. Then, CHAID analysis, one of the ML decision tree methods, was used to determine the effectiveness of the immunological blood values in estimating COVID-19 mortality. The interpretable decision tree obtained via CHAID analysis identified the most important immunological parameters influencing COVID-19 mortality. In addition, the cut-off values of the parameters most effective in estimating mortality were calculated, and the relationships between the parameters were examined. The confusion matrix was used to evaluate the performance of the decision tree, and the success of the tree in predicting the patient groups was calculated from this matrix. In addition, the changes in the ferritin, INR, and D-dimer parameters used to construct our decision tree were recorded on the days following admission during treatment. These values were averaged for each patient group and followed until the patients' discharge.

Immunological routine laboratory parameters used in the study
C-reactive protein (CRP), D-dimer, ferritin, fibrinogen, international normalized ratio (INR), prothrombin time (PT), procalcitonin (PCT), erythrocyte sedimentation rate (ESR), troponin, and activated partial thromboplastin time (aPTT) values measured at admission to the hospital were included in the study. Ferritin was assessed by chemiluminescence immunoassay (Centaur XP, Siemens Healthcare, Germany). PT, aPTT, and fibrinogen were determined with a fully digital coagulation instrument, the Ceveron Alpha (Diapharma Group Inc., West Chester, OH, USA). CRP was measured with the nephelometric method on the BN II System (Siemens, Munich, Germany). PCT, D-dimer, and troponin I were analyzed from whole blood on the AQT90 FLEX analyzer (Radiometer, Brønshøj, Denmark). The ESR was measured using a TEST 1 BCL device (Alifax, Padova, Italy) based on the principle of photometric capillary flow kinetic analysis.

Data extraction and workflow
Between the specified dates, the diagnostic records of 80,000 patients in our hospital registry system were examined, and the patients diagnosed with COVID-19 by RT-PCR test were filtered. The registry contained about 50 routine blood values (RBV), discharge status, gender, age, and various suspected underlying disease diagnoses of the COVID-19 patients. The comorbidity data of patients diagnosed with COVID-19 were examined; however, they were not used in this study because they were largely missing (more than 90%). Hematological and biochemical routine blood values were not included because they fell outside the scope of the study. Outliers in the data set were normalized, and missing values in the recorded immunological parameters were filled with the mean of the relevant parameter distribution. Immunological parameters with 30% or more missing data were discarded, and only 10 features qualified and were used in the study. Ultimately, the dataset contained 10 immunological test results, age, and gender for the patients who survived and died from COVID-19. The workflow of this study is summarized in Fig. 1.
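The missing-data rules described above (discard features with 30% or more missing values, then fill the remaining gaps with the feature mean) can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure actually run in SPSS: the function name `drop_and_impute`, the record layout, and the toy values (including the `il6` column) are hypothetical.

```python
from statistics import mean


def drop_and_impute(rows, features, max_missing=0.30):
    """Drop features with too many missing values, mean-impute the rest.

    rows: list of dicts mapping feature name -> float or None (None = missing).
    Returns (kept_features, cleaned_rows).
    """
    n = len(rows)
    kept = []
    for f in features:
        missing = sum(1 for r in rows if r.get(f) is None)
        # Features with 30% or more missing values are discarded.
        if missing / n < max_missing:
            kept.append(f)
    # Mean of the observed values for each kept feature.
    means = {f: mean(r[f] for r in rows if r.get(f) is not None) for f in kept}
    cleaned = [
        {f: (r[f] if r.get(f) is not None else means[f]) for f in kept}
        for r in rows
    ]
    return kept, cleaned


# Hypothetical toy records, not study data.
toy = [
    {"ferritin": 100.0, "crp": 5.0, "il6": None},
    {"ferritin": None, "crp": 7.0, "il6": None},
    {"ferritin": 300.0, "crp": 9.0, "il6": 4.0},
    {"ferritin": 200.0, "crp": 11.0, "il6": None},
]
kept, cleaned = drop_and_impute(toy, ["ferritin", "crp", "il6"])
```

In the toy data, `il6` is missing in 75% of records and is dropped, while the single missing `ferritin` value is replaced by the mean of the three observed values.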

Statistical analysis
Categorical variables were expressed as frequency and percentage, and continuous variables as median and quartile values. The Shapiro-Wilk test was used to assess the normality of the distributions of quantitative variables. In the comparison of continuous variables, normally distributed variables were analyzed with the independent-samples t-test, and those that were not normally distributed were analyzed with the Mann-Whitney U test. Categorical variables were analyzed with the χ2 test. CHAID analysis, one of the ML decision tree methods, was used to collectively evaluate the predictive immunological parameters in the estimation of COVID-19 mortality. SPSS (version 20.0, SPSS Inc, Chicago) was used for statistical analysis of the data. p < 0.05 was considered statistically significant.
The following statistical model specifications and stopping criteria were applied in the CHAID decision tree analysis:
(1) The significance level for splitting nodes was set at p < 0.05.
(2) The Bonferroni method was used to obtain adjusted significance values.
(3) The minimum change in expected cell frequencies was 0.001.
(4) The Pearson χ2 independence test was used to determine the relationships between independent (explanatory) variables.
(5) Model depth was set to maximum.
(6) A 10-fold cross-validation was used to evaluate the tree structure.
(7) Misclassification risk was calculated as a measure of model reliability.
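To give the cross-validated misclassification risk in items (6) and (7) a point of reference, the sketch below runs a 10-fold split over the study's outcome counts with a trivial majority-class predictor. It is an illustrative baseline only: the fold construction, the helper names, and the majority-class "model" are assumptions and do not reproduce the SPSS CHAID procedure.

```python
def k_fold_indices(n, k=10):
    """Split range(n) into k contiguous folds of near-equal size."""
    base, extra = divmod(n, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cv_misclassification_risk(labels, fit, predict, k=10):
    """Share of cases misclassified when each fold is predicted by a
    model fitted on the other k-1 folds."""
    errors = 0
    for test_idx in k_fold_indices(len(labels), k):
        held_out = set(test_idx)
        train = [y for i, y in enumerate(labels) if i not in held_out]
        model = fit(train)
        errors += sum(1 for i in test_idx if predict(model) != labels[i])
    return errors / len(labels)


# Baseline "model": always predict the majority class of the training folds.
def fit_majority(train_labels):
    return max(set(train_labels), key=train_labels.count)


def predict_majority(model):
    return model


# Outcome counts from the study: 2336 survivors (0) and 232 deaths (1).
# The ordering is arbitrary; a real analysis would shuffle or stratify.
labels = [0] * 2336 + [1] * 232
risk = cv_misclassification_risk(labels, fit_majority, predict_majority)
```

Because survivors are the majority in every training split, this baseline misclassifies exactly the 232 deaths, so its risk equals 232/2568 (about 9.0%). A useful tree has to achieve a clearly lower risk than this on such an unbalanced sample.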
Decision trees are considered one of the most popular approaches to representing classifiers. Comprehensive CHAID model estimation starts with the entire sample (called the "parent node" or "root") and then divides parent nodes into meaningful homogeneous subgroups ("child nodes" or "leaves") according to a certain discrete function. Splitting continues until predetermined stopping criteria are met. Each leaf is assigned the class that represents the most appropriate target value. Samples are classified by following the outcomes of the tests from the root of the tree down to a leaf.
This algorithm uses the χ2 independence test to determine the relationships between the independent (explanatory) variables and then selects the explanatory variables that best explain the dependent (response) variable based on "IF-THEN" logic. With Bonferroni-adjusted significance values, CHAID analysis can be used for prediction (similar to regression analysis) and classification, as well as for detecting interactions between variables [35,40].
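The test behind a single candidate split can be sketched as below: a Pearson χ2 statistic for a groups-by-outcome table, a p-value, and a Bonferroni adjustment compared against 0.05. This is a simplified illustration. The counts are hypothetical, the multiplier `m` is a placeholder rather than the multiplier CHAID derives from category merging, and the p-value uses a closed form that is valid only for even degrees of freedom.

```python
import math


def pearson_chi2(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    n = sum(row_tot)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_tot[i] * col_tot[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(col_tot) - 1)
    return stat, dof


def chi2_sf_even_dof(x, dof):
    """Upper-tail p-value of the chi-square distribution for even dof,
    using exp(-x/2) * sum_{i < dof/2} (x/2)^i / i!."""
    if dof <= 0 or dof % 2 != 0:
        raise ValueError("this closed form needs a positive even dof")
    half = x / 2.0
    terms = sum(half ** i / math.factorial(i) for i in range(dof // 2))
    return math.exp(-half) * terms


def bonferroni(p, m):
    """Bonferroni-adjusted p-value for m comparisons, capped at 1."""
    return min(1.0, p * m)


# Hypothetical counts: rows are three predictor groups,
# columns are (survived, died). Not study data.
table = [[900, 10], [5, 120], [1400, 12]]
stat, dof = pearson_chi2(table)
p_adj = bonferroni(chi2_sf_even_dof(stat, dof), m=3)
split_is_significant = p_adj < 0.05
```

A split is retained only when the adjusted p-value stays below the chosen significance level, which is what keeps a tree with many candidate predictors from splitting on chance differences.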

Demographic characteristics and immunological tests
The demographic characteristics of patients who died from COVID-19 or survived after treatment and the immunological blood values of all patients at admission to the hospital are shown in Table 1. Of the 2568 patients in this study, 2336 (91.0%) survived and 232 (9.0%) died. Of the patients who died, 142 (61.2%) were male and 90 (38.8%) were female. The mean age was 55.00 years for the surviving patients and 76.00 years for the patients who died.
According to the chi-square test, gender was associated with disease mortality (p < 0.001), with male gender appearing as a factor that increases mortality. In addition, the patients who died were significantly older than those who survived (p < 0.001). Accordingly, advanced age was associated with the mortality of the disease (Table 1).
When Table 1

M.T. Huyut and Z. Huyut
However, activated partial thromboplastin time (aPTT; 35.34 vs. 31.00 s) was lower in deceased patients than in surviving patients.

CHAID decision tree for detecting surviving and non-surviving from COVID-19
The CHAID decision tree model created to determine the predictive immunological variables affecting the mortality of COVID-19 patients was presented in Fig. 2. These values were the values measured at admission to the hospital. When the decision tree diagram in Fig. 2 is examined, 232 (9.0%) of 2568 patients included in the study died, while 2336 (91.0%) of them recovered. As seen in Fig. 2, the decision tree diagram estimating the mortality of COVID-19 patients consisted of only the immunological parameters ferritin, INR, and D-dimer. The presence of ferritin and subsequently INR and D-dimer at the root of the decision tree can be interpreted as the robustness of the approach to estimating disease mortality and indicates the clinical accuracy of our decision tree.
Examination of the decision tree showed that the variable with the greatest effect on the mortality of the disease was ferritin (χ2 = 2269.777, p < 0.001). The patients were divided into three groups according to the ferritin cut-off values determined for estimating COVID-19 mortality. The ferritin value was unavailable for 34.3% (n = 882) of the patients included in the study; however, all of these were surviving patients. The first group had ferritin values < 389.10 μg/L, and the mortality of this group was 1.3%. The second group had ferritin values in the range of 389.10 μg/L to 395.00 μg/L, and the mortality of this group was 96.4%. The third group had ferritin values > 395.00 μg/L, and the mortality of this group was 0.7%. Setting the third group aside, mortality rose sharply with increasing ferritin between the first two groups (1.3% vs. 96.4%). Since the first two groups constitute the majority of the study population, these data suggest that, in general, higher ferritin values are associated with higher COVID-19 mortality. Although the third group's values seem to contradict this pattern, the findings are not necessarily inconsistent: because these values were measured at admission to the hospital, ferritin probably decreased in the following days in most of the surviving patients in the third group. Similarly, the D-dimer split indicated that an increased D-dimer level is associated with higher mortality (0.8% vs. 2.8%).
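The first-level ferritin split described above can be written as explicit IF-THEN rules. The sketch covers only the ferritin node, since the INR and D-dimer cut-offs are not restated here. The function name, the treatment of values that fall exactly on a cut-off, and the separate "missing" label are assumptions made for illustration.

```python
FERRITIN_LOW_CUT = 389.10   # μg/L, first cut-off reported in the text
FERRITIN_HIGH_CUT = 395.00  # μg/L, second cut-off reported in the text

# Observed mortality per first-level ferritin group, as reported in the text.
REPORTED_MORTALITY = {"low": 0.013, "middle": 0.964, "high": 0.007}


def ferritin_group(ferritin_ug_per_l):
    """Assign a patient to a first-level ferritin node.

    Which side of each cut-off the boundary values fall on is an assumption.
    """
    if ferritin_ug_per_l is None:
        return "missing"  # assumption: unavailable values are kept apart
    if ferritin_ug_per_l < FERRITIN_LOW_CUT:
        return "low"
    if ferritin_ug_per_l <= FERRITIN_HIGH_CUT:
        return "middle"
    return "high"


group = ferritin_group(392.0)
mortality_in_group = REPORTED_MORTALITY[group]
```

Written this way, the narrowness of the middle interval (389.10 to 395.00 μg/L) becomes apparent, which is worth keeping in mind when the cut-offs are interpreted clinically.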
In detecting the patient groups, the overall accuracy (success rate) of our decision tree model, which was created with the ferritin, INR, and D-dimer immunological variables to predict COVID-19 mortality, was 99.0%, and this result was statistically significant (Fig. 2, Table 2). Accordingly, our decision tree model correctly classified 99.7% of the patients who survived and 92.7% of those who died, taking into account the cut-off values of ferritin in particular, together with INR and D-dimer. In addition, detecting surviving and non-surviving patients with sensitivity and specificity values above 90% showed that our decision tree is robust and performs well even on unbalanced datasets (Table 3).
In addition, the changes in the ferritin, INR, and D-dimer parameters used in the construction of our decision tree on the days after hospital admission are presented in Fig. 3 by patient group. From admission until discharge, ferritin (except on days 5 and 6), D-dimer, and INR levels were higher in the non-surviving group than in the surviving group. However, it was notable that ferritin values were higher in surviving patients on the 5th and 6th days (Fig. 3).
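The relation between the per-class rates and the overall accuracy can be checked with a short calculation. It uses only the group sizes and the rounded percentages reported in the text, so the result is approximate. Treating death as the positive class (so that 92.7% plays the role of sensitivity and 99.7% of specificity) is an assumption of this sketch.

```python
n_survived, n_died = 2336, 232            # group sizes from the study
rate_survived, rate_died = 0.997, 0.927   # correctly classified share (rounded)

n_total = n_survived + n_died

# Overall accuracy is the class-size-weighted mean of the per-class rates.
overall_accuracy = (n_survived * rate_survived + n_died * rate_died) / n_total

# Balanced accuracy weights both classes equally, which matters when
# one class is about ten times larger than the other.
balanced_accuracy = (rate_survived + rate_died) / 2
```

The weighted mean comes out at roughly 0.991, in line with the reported 99.0% given the rounding of the inputs, while the balanced accuracy of about 0.962 shows how much of the overall figure is carried by the large surviving group.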

Discussion
COVID-19 has attracted attention for its higher infectivity and mortality than influenza [1,3,41]. The majority of infected patients had mild symptoms, whereas severe manifestations such as severe pneumonia, acute respiratory distress, and multi-organ dysfunction were much less prevalent [17,38,42]. In addition, there remains a need to determine the immune status of patients, especially those at risk, and to identify biomarkers that affect the mortality of the disease [1,43]. Therefore, early detection of patients who require special care and have a high risk of death, together with effective identification of relevant biomarkers, is important for reducing the mortality of the disease [11,28,36,38]. In this study, the importance of monitoring immunological parameters with decision trees in predicting COVID-19 mortality at admission and during hospitalization was evaluated. Our decision tree, which was constructed with INR, D-dimer, and especially ferritin, predicted COVID-19 mortality with high accuracy (Fig. 2). In addition, the presence of ferritin, followed by INR and D-dimer, at the root of our decision tree supports the clinical accuracy of our interpretable decision tree for estimating COVID-19 mortality.
In this study, 91% (n = 2336) of COVID-19 patients survived, while 9% (n = 232) of patients died. While 61.2% (n = 142) of the patients who died were male, 38.8% (n = 90) of them were female. While the mean age of the patients who survived was 55.00, it was 76.00 years for the patients who died (Table 1). According to the results, male gender and advanced age were found to be demographic characteristics that increase mortality (Table 1). Zhou et al. [18] found the COVID-19 mortality to be 28.3%, while Guan et al. [44] found it to be 1.4%. In other studies, mortality from COVID-19 was found to be 45%, 4.3%, and 28.4% [36,45,46]. The different mortality outcomes of COVID-19 may have resulted from the different sample sizes and case inclusion criteria used in the studies. Also, in this study, the increase in the severity and mortality of the disease by high age and male gender were found to be compatible with the literature [18,19,36,45].
Banerjee et al. [47] found the glmnet model to be the most successful (92% sensitivity, 91% accuracy) in distinguishing COVID-19-positive patients (n = 81) from healthy individuals (n = 517) using 14 RBV values. Huyut [48] ran 11 supervised machine learning models with 28 RBV values to distinguish ICU from non-ICU patients and thereby determine the prognosis of COVID-19. In that study, the most successful models in identifying mildly infected patients were the locally weighted learning model (AUC 0.95) and the KStar model (AUC 0.91) [48]. Brinati et al. [49] used 13 RBV features for the detection of COVID-19 with various classification models; the random forest and logistic regression models had the highest accuracy (82% and 78%, respectively). Huyut and Velichko [50] used 51 RBV values with the LogNNet neural network model to determine the diagnosis and prognosis of COVID-19 quickly, economically, and safely. They achieved 99% accuracy in the diagnosis of the disease and 83% accuracy in the determination of its prognosis. Zhang et al. [51] used various demographic and RBV features with least absolute shrinkage and selection operator (LASSO) regression and a LASSO-based artificial neural network (ANN) model to identify predictors of COVID-19 mortality. With the 9 prognostic factors they identified for mortality, the LASSO regression model and the LASSO-based ANN model determined the clinical status of patients with an AUC of 98% [51].
Fig. 3. Dynamic changes of three immunological parameters that were effective in the prediction of COVID-19 mortality in the thirty days after hospitalization.
Zhu et al. [52] applied a deep learning algorithm to predict COVID-19 mortality using 78 features from demographic, clinical, and laboratory tests, and reported an AUC of 95.4% for the method [52]. Alle et al. [53] applied 70 features from clinical and laboratory tests to various machine learning models to predict COVID-19 mortality; the XGBoost and logistic regression models performed best (AUC of 83% and 92%, respectively). In addition, Alle et al. [53] found, via the community-averaged Bayesian network, that serum ferritin was the most important predictor of COVID-19 mortality. For the prediction of critical COVID-19 patients, Gao et al. [54] developed an ensemble model derived from support vector machine (SVM), gradient boosted decision tree (GBDT), and neural network (NN) algorithms using immune-inflammatory features. The developed model detected severely infected patients with an AUC of 0.99 [54]. Vaishnav et al. [55] used various machine learning models to predict COVID-19 deaths in India; the decision tree regression model produced an accuracy of 70% and the random forest regression model an accuracy of 76%. Booth et al. [56] developed a machine learning model using five routine laboratory parameters to identify prognostic serum biomarkers in COVID-19 patients at risk of death, and the model predicted mortality with 91% sensitivity and 91% specificity.
Kukar et al. [57] created a machine learning model based on routine blood tests to diagnose COVID-19 among 5333 negative and 160 COVID-19-positive patients with various bacterial and viral infections. The model achieved 81.9% sensitivity and 97.9% specificity [57]. In a study using tomography data and some RBV features, Mei et al. [58] proposed a model combining a CNN and a multilayer perceptron, which diagnosed COVID-19 with 84% sensitivity and 83% specificity. In a study on the diagnosis of COVID-19, Soares [59] proposed a model combining SVM, ensembling, and SMOTEBoost with 15 RBV parameters, which achieved 86% specificity and 70% sensitivity. In another diagnostic study using various RBV parameters, Soltan et al. [60] found the XGBoost method to be the most successful model, with 85% sensitivity and 90% accuracy. Velichko et al. [61] studied a large patient population with 13 different ML classifier models to detect COVID-19; the histogram-based gradient boosting (HGB) model was the most successful, detecting all patients with 100% accuracy. In that study, they also implemented the internet of things (IoT) in online and offline modes for the diagnosis of COVID-19. Huyut et al. [62] used 16 different classifier models and RBV values on patients who died (n = 233) and recovered (n = 2364) from COVID-19 to detect important biomarkers that affect COVID-19 mortality and to determine the risk levels of these features. The most successful model in predicting mortality with only three features (D-dimer, ESR, direct bilirubin) was the HGB model, with precision > 0.98 and recall > 0.98 [62].
Huyut and İlkbahar [38] used the CHAID decision tree algorithm with various biomarkers to distinguish COVID-19-positive from negative patients and severely from mildly infected COVID-19 patients. The accuracy of the decision tree was 81.6% in determining the diagnosis and 93.5% in determining the prognosis of the disease [38]. Huyut and Üstündağ [28] used the CHAID decision tree algorithm with various blood gas parameters for the same two tasks. The accuracy of the decision tree was 68.2% in determining the diagnosis and 65.0% in determining the prognosis of the disease [28]. Using various demographic and clinical features, Doganay et al. [36] determined COVID-19 mortality with the CHAID decision algorithm at an accuracy rate of 81.0%. Al-Najjar et al. [63] used various machine learning algorithms to detect patients who recovered and died from COVID-19. They found the accuracy of the CHAID decision tree algorithm to be 82.0% for the training set and 78.0% for the test set [63]. Guerrero et al. [64] examined the compliance of Canadian children and youth with 24-h movement guidelines during the COVID-19 outbreak using the CHAID algorithm in four different categories, and found accuracy rates ranging from 70% to 97.4% [64]. Feigin et al. [65] identified age-related biomarkers using the CHAID algorithm to predict in-hospital mortality in COVID-19 patients. Kazawa et al. [66] used the CHAID algorithm to identify characteristics of early-career nurse researchers who were adversely affected during the COVID-19 epidemic. Sacco et al. [67] used the CHAID algorithm to identify the symptoms and chronological aspects of the spread of SARS-CoV-2 among both residents and caregivers in a nursing home. Seddawy and Mohamed [68] analyzed the impact of COVID-19 on e-commerce and online shopping using the CHAID decision tree. They suggested that the CHAID algorithm can be used to determine customers' purchasing preferences, with a sensitivity of 89.09% and a classification error of 10.9% [68].
Ferritin has been suggested to be a useful marker for predicting outcomes in patients with COVID-19 [12]. Numerous studies have shown that higher ferritin levels, along with other proinflammatory markers including CRP and IL-6, are associated with worse outcomes [12,18,69]. In another study, high levels of inflammatory markers such as CRP, erythrocyte sedimentation rate, IL-6, and procalcitonin were reported to indicate hyperinflammatory reactions in COVID-19 patients [11]. Gandini et al. [70] noted an extremely elevated ferritin level (mean serum ferritin > 500 μg/L) in all severe COVID-19 patients at the time of admission and reported that milder cases had a mean ferritin level of 303 ± 224 μg/L. In addition, Gandini et al. [70] and Gao et al. [71] noted that the ferritin level of severe and intensive care COVID-19 patients was 2.6-fold and 5.8-fold higher than that of mild patients, respectively. Dahan et al. [72] observed higher ferritin levels in moderate to severe COVID-19 patients than in mild patients (mean ferritin level 327.27 ng/mL in mild, 1555 ng/mL in moderate, and 2817.6 ng/mL in severe patients). A meta-analysis of 189 observational studies comprising 57,563 COVID-19 patients reported a significant difference in mean ferritin levels between survivors and non-survivors [73]. Another meta-analysis involving 5350 patients showed that high ferritin was associated with a poor outcome in COVID-19 and with the development of ARDS [74]. Taneri et al. [73] stated that plasma exchange, high-volume hemofiltration, and desferrioxamine could be used to reduce ferritin levels in COVID-19 patients. In addition, some studies reported that these therapies are currently used in the treatment of sepsis and macrophage activation syndrome [74-76]. In this study, patients who died from COVID-19 had 1.75 times higher ferritin values than patients who survived.
In addition, the overall accuracy of the CHAID decision tree created to identify the predictive immunological variables affecting the mortality of COVID-19 patients in this study was 99% in classifying all patients (Table 2). Moreover, our decision tree correctly predicted 99.7% of surviving patients and 92.7% of deceased patients. Furthermore, detecting surviving and non-surviving patients with sensitivity and specificity values of over 90% showed that our decision tree is robust and performs well even on imbalanced datasets (Table 3).
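As a rough illustration of how the per-class rates and the overall accuracy relate in an imbalanced dataset, the following minimal Python sketch recomputes them from a confusion matrix. The four counts are not taken from Tables 2 or 3; they are an assumption, reconstructed by rounding from the group sizes given in the abstract (232 non-survivors, 2336 survivors) and the reported per-class rates (92.7% and 99.7%). The function name `classification_metrics` is hypothetical.

```python
def classification_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Return basic metrics for a binary classifier.

    Here the positive class is "died" and the negative class is "survived".
    """
    sensitivity = tp / (tp + fn)          # share of non-survivors detected
    specificity = tn / (tn + fp)          # share of survivors detected
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    # Balanced accuracy weights both classes equally, so it is not
    # dominated by the much larger group of survivors.
    balanced_accuracy = (sensitivity + specificity) / 2
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "balanced_accuracy": balanced_accuracy,
    }


# ASSUMED counts, reconstructed from reported percentages (not from the tables):
# about 92.7% of 232 non-survivors and 99.7% of 2336 survivors.
metrics = classification_metrics(tp=215, fn=17, tn=2329, fp=7)

for name, value in metrics.items():
    print(f"{name}: {value * 100:.1f}%")
```

Under these assumed counts, the overall accuracy is driven mainly by the large group of survivors, which is why the per-class rates are reported alongside it.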
In our interpretable decision tree, ferritin was found to be the most important immunological parameter for the mortality of COVID-19 (Fig. 2). In addition, patients with ferritin values of 389.10 μg/L to 395.00 μg/L had a significantly higher mortality rate than patients with ferritin values less than 389.10 μg/L (96.4% vs. 1.3%). These results showed that elevated ferritin values significantly increased mortality. In previous studies, high serum ferritin levels were associated with the development of mortality and serious outcomes in COVID-19 [49,77,78]. In this study, high ferritin values in deceased patients were seen as an important risk factor that greatly increased the mortality of the disease. However, the differences in ferritin levels between severely and mildly infected patients reported in other studies may be due to factors such as the number of samples included in each study, the infection period of the patients, and secondary bacterial infections.
Increased D-dimer is common in COVID-19 patients and can be attributed to sepsis-induced coagulopathy. In addition, D-dimer may reflect a higher thromboembolic risk in severe cases of COVID-19 [79,80]. In one study, D-dimer levels were found to be significantly higher in severe patients than in non-severe COVID-19 patients and were associated with severe COVID-19 disease [81]. A meta-analysis of 5872 COVID-19 patients found that higher D-dimer concentrations were associated with increased disease severity and mortality in COVID-19 patients [79]. In addition, one study reported that D-dimer (>2.0 mg/L) at admission increased COVID-19 mortality [81]. In another study, it was reported that a reduction in D-dimer levels was observed in recovered patients independent of anticoagulant therapy, while a sustained increase in D-dimer levels may be predictive of a higher risk of thromboembolism and adverse outcomes [82,83].
In a meta-analysis to determine the predictors of COVID-19 severity, Kurniawan et al. [21] found an average of 36.88 mg/L higher CRP, an average of 0.07 ng/mL higher procalcitonin, an average of 0.043 μg/L higher D-dimer, and an average of 4.58 g/L lower albumin in severe patients compared with mild patients. In that study, the AUC values for CRP, procalcitonin, LDH, D-dimer, and albumin in the detection of severe COVID-19 patients were 92%, 89%, 84%, 83%, and 82%, respectively [21]. However, the cut-off values for these features cannot be considered very successful, because the sensitivity values were low and the sensitivity and specificity results differed considerably for most features. Another study reported that monitoring dynamic variations of D-dimer can be a useful diagnostic tool to predict the prognosis of patients with COVID-19 and that peak D-dimer levels may be strongly associated with mortality in patients with COVID-19 [84,85]. In a recent study, Winata and Kurniawan [86] reported a significant increase in D-dimer and fibrinogen degradation products (FDP) in the advanced stages of all severe COVID-19 patients with a poor prognosis. They also stated that these significant increases in D-dimer and FDP levels occur due to increased hypoxia in severe and advanced-stage COVID-19, and that increases in these values may be significantly associated with coagulation.
In a meta-analysis examining the relationship between INR and COVID-19 severity and mortality, it was noted that INR values were significantly increased in deceased and severely infected COVID-19 patients [87]. Similarly, previous studies have reported that patients with COVID-19 who died had a higher INR, which was considered a predictor of mortality [85,86]. In this study, D-dimer levels were 2.4 times higher and INR levels were 1.14 times higher in the patients who died from COVID-19 compared to the patients who survived. In addition, in our decision tree, the most important immunological parameter for mortality was INR in patients with ferritin values less than 389.10 μg/L (Fig. 2). The mortality rate was 15.1% in patients with an INR higher than 1.19, while this rate was 0.2% in patients with an INR less than 1.19. In addition, D-dimer was found to be the most important immunological parameter for mortality in patients with ferritin values less than 389.10 μg/L and INR less than 1.19 (Fig. 2). Furthermore, the mortality rate was 2.8% in patients with a D-dimer level higher than 1239.40 μg/L, while the mortality rate was zero in patients with a D-dimer level lower than 1239.40 μg/L. According to these results, elevated INR and D-dimer levels increased mortality. In addition, the changes in the ferritin, INR, and D-dimer values, which were found to be the most effective tests in the prediction of COVID-19 mortality (Fig. 2), were examined from admission to exit (Fig. 3). Accordingly, the ferritin, INR, and D-dimer values of the patients who died were consistently higher than those of the patients who survived. Another interesting point in our study was that the mortality rate was "zero" in patients with a ferritin value below 389.10 μg/L, an INR value below 1.19, and a D-dimer value below 1239.40 μg/L (Fig. 1). This result was an important indication that these parameters and their cut-off values should be included in the scoring systems to be planned for COVID-19 mortality.
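The branch structure described above can be written down as a small rule set. The following Python sketch is only an illustration of the cut-offs and node mortality rates quoted in this section; it is not the fitted CHAID model. The function name `reported_mortality_rate`, the handling of values that fall exactly on a cut-off, and the `None` result for ferritin above 395.00 μg/L (a range for which no mortality rate is quoted in this section) are all assumptions.

```python
from typing import Optional

# Cut-off values quoted in the text (ferritin and D-dimer in ug/L).
FERRITIN_CUT = 389.10
FERRITIN_UPPER = 395.00
INR_CUT = 1.19
D_DIMER_CUT = 1239.40


def reported_mortality_rate(ferritin: float, inr: float,
                            d_dimer: float) -> Optional[float]:
    """Return the mortality rate (%) quoted for the matching tree node.

    Boundary handling (>= versus >) is an assumption; the text only gives
    "higher than" and "less than" comparisons.
    """
    if ferritin >= FERRITIN_CUT:
        if ferritin <= FERRITIN_UPPER:
            return 96.4   # ferritin between 389.10 and 395.00 ug/L
        return None       # range not described in this section
    # Ferritin below 389.10 ug/L: INR is the next splitting variable.
    if inr > INR_CUT:
        return 15.1
    # Ferritin below 389.10 ug/L and INR not above 1.19: split on D-dimer.
    if d_dimer > D_DIMER_CUT:
        return 2.8
    return 0.0            # all three values below their cut-offs


# Hypothetical example patient, not taken from the dataset.
print(reported_mortality_rate(ferritin=200.0, inr=1.30, d_dimer=500.0))
```

Encoding the tree this way makes the order of the splits explicit: ferritin is evaluated first, INR only for patients below the ferritin cut-off, and D-dimer only for patients below both.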
Elevated serum CRP levels are key markers of disease progression and a risk factor for mortality in severe COVID-19 patients and have been reported to be indicative of a developing cytokine storm in COVID-19 patients [76,88]. It has been reported that 20 out of 32 studies showed that the risk of poor outcomes was approximately four times higher in COVID-19 patients with high CRP [89]. Also, several studies have found higher levels of procalcitonin in severe COVID-19 patients compared to non-severe patients [31,38,49]. In another study, high procalcitonin levels were found in 85 of 290 patients and were associated with mortality in COVID-19 patients [89]. In another study, increased procalcitonin values were associated with an approximately 5-fold higher risk of severe SARS-CoV-2 infection [17]. Many studies reported that patients who died from COVID-19 had significantly higher levels of CRP, D-dimer, and procalcitonin than surviving patients, and found that these parameters were associated with the mortality of the disease [4,6,18,31,38,51,88]. Similarly, many studies reported higher CRP, fibrinogen, D-dimer, troponin, and procalcitonin levels in severely infected and deceased COVID-19 patients compared to surviving patients [31,38,90,91]. In this study, the CRP (10.70-fold), fibrinogen (1.10-fold), PT (1.13-fold), procalcitonin (22.90-fold), ESR (2.77-fold), and troponin (3.35-fold) values in the non-surviving patients were higher than in the surviving patients. In addition, aPTT in patients who died was lower than in patients who survived (0.88-fold) (Table 1). However, fibrinogen, procalcitonin, troponin, and aPTT did not show a significant effect in our decision tree (Fig. 2), which was created to determine the immunological parameters affecting the mortality of COVID-19. According to these results, we can say that INR, D-dimer, and especially ferritin have a more important effect on the prediction of mortality in COVID-19 compared to other immunological parameters.

Limitations of the study
Our study has potential limitations due to its retrospective nature. In this study, only the immunological parameter values obtained at admission were used to estimate the mortality of the disease. Therefore, how the ferritin values on the days after admission predicted mortality was not investigated. In addition, comorbidity data of the patients were not available. Moreover, the patients' pre-COVID-19 ferritin levels were not known, so they could not be compared with post-COVID-19 ferritin values. Finally, it is recommended that the findings of this study on estimating COVID-19 mortality be tested with external datasets.

Conclusion
Identifying risk factors and thresholds for severe and critical COVID-19 patients is important for early and rapid clinical intervention. However, determining the severe infection status of COVID-19 patients using various diagnostic tests and imaging results is costly and time-consuming. In addition, different complications may occur during the procedure. In this case, the patient's health may be put at greater risk, healthcare providers may come under more pressure, and tragic situations may occur.
In this study, the characteristics that affect the mortality of severely infected COVID-19 patients and their critical importance levels were determined by the CHAID decision tree, based only on the immunological/inflammatory routine blood values (RBVs) measured at hospital admission. The most important features affecting COVID-19 mortality were found to be INR, D-dimer, and especially serum ferritin. The overall accuracy rate of our interpretable decision tree, which was created with only these three features, in classifying COVID-19 groups was found to be 99.0%. In addition, our decision tree correctly identified 99.7% of patients who survived and 92.7% of patients who died.
In this study, 94% of the patients who died from COVID-19 had ferritin levels between 389.10 μg/L and 395.00 μg/L, indicating that this ferritin value range is a critical level for the mortality of COVID-19. In addition, patients with an INR above 1.19 had a much higher mortality rate than the others. Moreover, the mortality rate was 2.8% in patients with D-dimer levels higher than 1239.40 μg/L, while the mortality rate was zero in patients with a D-dimer level lower than 1239.40 μg/L. According to these results, it can be concluded that high INR, D-dimer, and especially ferritin levels significantly increase COVID-19 mortality. We suggest that these features and cut-off values can be used as important biomarkers in the prediction of COVID-19 mortality.

Ethical approval
Institutional Review Board Statement: The dataset used in this study was collected to be used in various studies in the estimation of the diagnosis, prognosis, and mortality of COVID-19. The necessary permissions for the collected dataset were given by the Ministry of Health of the Republic of Turkey and the Ethics Committee of Erzincan Binali Yıldırım University. This study was conducted following the 1989 Declaration of Helsinki. Erzincan Binali Yıldırım University Human Research Health and Sports Sciences Ethics Committee.

Informed consent statement
In this study, a dataset including only routine blood values, RT-PCR results (positive or negative), and treatment units of the patients was downloaded retrospectively in digital form from the information system of our hospital. No new samples were taken from the patients. The dataset contains no information that could identify individuals. It was stated that routine blood values would only be used in academic studies, and written consent was obtained from the institutions for this. Therefore, written informed consent was not obtained from individual patients.

Author contribution statement
Mehmet Tahir HUYUT: Conceived and designed the experiments; Performed the experiments; Analyzed and interpreted the data; Contributed reagents, materials, analysis tools or data; Wrote the paper. Zübeyir HUYUT: Analyzed and interpreted the data; Wrote the paper.

Funding statement
This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Declaration of interest statement
The authors declare no competing interests.