Facilitating Safe Discharge Through Predicting Disease Progression in Moderate Coronavirus Disease 2019 (COVID-19): A Prospective Cohort Study to Develop and Validate a Clinical Prediction Model in Resource-Limited Settings

Abstract
Background: In locations where few people have received coronavirus disease 2019 (COVID-19) vaccines, health systems remain vulnerable to surges in severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infections. Tools to identify patients suitable for community-based management are urgently needed.
Methods: We prospectively recruited adults presenting to 2 hospitals in India with moderate symptoms of laboratory-confirmed COVID-19 to develop and validate a clinical prediction model to rule out progression to supplemental oxygen requirement. The primary outcome was defined as any of the following: SpO2 < 94%; respiratory rate > 30 BPM; SpO2/FiO2 < 400; or death. We specified a priori that each model would contain 3 clinical parameters (age, sex, and SpO2) and 1 of 7 shortlisted biochemical biomarkers measurable using commercially available rapid tests (C-reactive protein [CRP], D-dimer, interleukin 6 [IL-6], neutrophil-to-lymphocyte ratio [NLR], procalcitonin [PCT], soluble triggering receptor expressed on myeloid cell-1 [sTREM-1], or soluble urokinase plasminogen activator receptor [suPAR]), to ensure the models would be suitable for resource-limited settings. We evaluated discrimination, calibration, and clinical utility of the models in a held-out temporal external validation cohort.
Results: In total, 426 participants were recruited, of whom 89 (21.0%) met the primary outcome; 257 participants comprised the development cohort, and 166 comprised the validation cohort. The 3 models containing NLR, suPAR, or IL-6 demonstrated promising discrimination (c-statistics: 0.72–0.74) and calibration (calibration slopes: 1.01–1.05) in the validation cohort and provided greater utility than a model containing the clinical parameters alone.
Conclusions: We present 3 clinical prediction models that could help clinicians identify patients with moderate COVID-19 suitable for community-based management. The models are readily implementable and of particular relevance for locations with limited resources.

(LMICs) is the practical ceiling of care [5]. The World Health Organization (WHO) estimates that 15% of patients with symptomatic COVID-19 will require supplemental oxygen [6]. Effective identification of patients who are unlikely to become hypoxic would have considerable benefit; tools to support triage could decompress healthcare systems by giving practitioners confidence to allocate resources more efficiently [7].
Numerous prognostic models for COVID-19 have been developed [8,9]. Almost all predict critical illness or mortality and thus cannot inform whether a patient might be safely managed in the community. Of those that focus on patients with moderate disease, most rely on retrospective or registry-based data [10-14], lack external validation [15,16], and are not feasible for use in resource-limited settings [9,17]. Moreover, most existing studies did not follow best-practice guidelines for model building and reporting [18], are at high risk of bias [8], and the resulting models are neither suitable nor recommended for use in LMIC contexts [9].
We set out to develop and validate a clinical prediction model to rule out progression to supplemental oxygen requirement in patients presenting with moderate COVID-19. We hypothesized that combining simple clinical parameters with host biomarkers feasible for measurement in resource-limited settings and implicated in the pathogenesis of COVID-19 would improve prognostication.

Study Population
PRIORITISE is a prospective observational cohort study. Consecutive patients aged ≥ 18 years with clinically suspected severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection presenting with moderate symptoms to the All India Institute of Medical Sciences (AIIMS) Hospital in Patna, India, and the Christian Medical College (CMC) Hospital in Vellore, India, were screened (daytime hours, Monday to Saturday). AIIMS is a 1000-bed hospital and the largest medical facility providing primary-to-tertiary healthcare in the state of Bihar. CMC is a 3000-bed not-for-profit hospital that provided care for ~1500 patients with COVID-19 each day during the peak of the Delta-wave surge in India.
We adapted the case definitions in the World Health Organization (WHO) Clinical Management guideline (moderate disease) [6] and WHO Clinical Progression Scale (WHO-CPS; scores 2, 3, or 4) [19] to define moderate disease as follows: a peripheral oxygen saturation (SpO2) ≥ 94% and respiratory rate < 30 breaths per minute (BPM), in the context of systemic symptoms (breathlessness or fever and chest pain, abdominal pain, diarrhea, or severe myalgia), recognizing that the threshold for hospitalization varies throughout a pandemic and that a sensitive cutoff for hypoxia would be desirable in a tool to inform community-based management [19,20].

Data Collection
Structured case-report forms (Supplementary Materials 2-10) were completed at enrollment, day 7, and day 14, and daily during admission to the study facilities. Anthropometrics and vital signs were measured at enrollment, and demographics, clinical symptoms, comorbidities, and medication history were collected via brief interview with the participant. Venous blood samples were collected at enrollment in ethylenediaminetetraacetic acid (EDTA) tubes. Participants were followed up in person when admitted to the facility and by telephone on days 7 and 14 if discharged prior to this. Those discharged who reported worsening symptoms on day 7 and/or persistent symptoms on day 14 were recalled to have their SpO2 and respiratory rate measured.

Primary Outcome
The primary outcome was development of an oxygen requirement within 14 days of enrollment, defined as any of the following: SpO2 < 94%; respiratory rate > 30 BPM; SpO2/FiO2 < 400 [21,22]; or death, aligning closely with a WHO-CPS score of ≥ 5 [19]. Patients who received supplemental oxygen outside the study facilities were classified as meeting the primary outcome if it was not possible to retrieve their case notes, provided that the oxygen was prescribed in a licensed medical facility. The site study teams were unaware of which baseline variables had been preselected as candidate predictors when determining outcome status.
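The composite definition above can be expressed as a simple rule. The following is a minimal sketch rather than the study's own code; the function and parameter names are hypothetical, and it assumes SpO2 is recorded in percent and FiO2 as a fraction (0.21 on room air), so that the SpO2/FiO2 criterion only becomes decisive once supplemental oxygen has been started.

```python
def sf_ratio(spo2_percent: float, fio2_fraction: float) -> float:
    """SpO2/FiO2 ratio; assumes SpO2 in percent and FiO2 as a fraction (room air = 0.21)."""
    return spo2_percent / fio2_fraction


def met_primary_outcome(spo2_percent: float,
                        resp_rate_bpm: float,
                        fio2_fraction: float = 0.21,
                        died: bool = False) -> bool:
    """True if any component of the composite primary outcome is present."""
    return (
        died
        or spo2_percent < 94                               # hypoxia
        or resp_rate_bpm > 30                              # tachypnea
        or sf_ratio(spo2_percent, fio2_fraction) < 400     # impaired SpO2/FiO2
    )


# Hypothetical observations, for illustration only.
print(met_primary_outcome(97, 20))                       # stable on room air
print(met_primary_outcome(98, 20, fio2_fraction=0.28))   # normal SpO2 but on oxygen
```

On room air a saturation of 94% gives a ratio of roughly 448, so the ratio criterion mainly captures participants whose saturation was normalized by supplemental oxygen.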

Candidate Predictors
We decided a priori that a model using 4 predictors would be practical in high-patient-throughput resource-limited settings. Considering resource constraints, reliability, validity, feasibility, and biological plausibility, we prespecified that each model would contain age, sex, SpO2, and 1 biochemical biomarker [10,17,23].
Clinical predictors were measured at enrollment, and all biomarkers except NLR were measured retrospectively from samples obtained at enrollment. NLR was measured on site and was not repeated if it had been measured at the site within 24 hours prior to recruitment. All predictors were measured blinded to outcome status.

Sample Size
We considered the sample size for model development and validation separately. We followed the recommendations of Riley et al and assumed a conservative Nagelkerke R² of 0.15 [31]. We anticipated that ~8% of participants would meet the primary endpoint and estimated that 44 outcome events would be required to derive a prediction model comprising 4 candidate predictors and minimize the risk of overfitting (events per parameter [EPP] = 11).
Given the uncertainty around deterioration rates amongst patients with moderate COVID-19 at the time of study inception, we prespecified an interim review after the first 100 participants were recruited. At this review, the proportion of participants meeting the primary endpoint was higher than anticipated (20% vs. 8%). At this higher prevalence, and using R² values from 0.20 to 0.15, between 52 and 68 outcome events (EPP = 13-17) would be required to develop the prediction models [31]. Recognizing that (i) our range of R² estimates was conservative, (ii) penalized regression methods would reduce the risk of overfitting, and (iii) the external validation cohort would allow assessment of model optimism, and following the advice of the External Advisory Panel, a decision was made to use the first 50 outcome events to derive the models. Participants recruited after that point were entered into the external temporal validation cohort.
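As an illustration of how the initial estimate can be approached, the sketch below applies one of the criteria described by Riley et al, namely limiting overfitting to an expected shrinkage factor of 0.9. This is an assumption: the text does not state which criterion or target shrinkage was used, and the function names are hypothetical. The Nagelkerke R² is converted to the Cox-Snell scale using the maximum Cox-Snell R² attainable at the anticipated prevalence.

```python
import math


def max_cox_snell_r2(prevalence: float) -> float:
    """Maximum attainable Cox-Snell R2 for a binary outcome at a given prevalence."""
    ll_null = (prevalence * math.log(prevalence)
               + (1 - prevalence) * math.log(1 - prevalence))
    return 1 - math.exp(2 * ll_null)


def required_sample_size(n_params: int, prevalence: float,
                         r2_nagelkerke: float, shrinkage: float = 0.9) -> float:
    """Participants needed so that expected shrinkage is at least `shrinkage` (assumed 0.9)."""
    r2_cs = r2_nagelkerke * max_cox_snell_r2(prevalence)
    return n_params / ((shrinkage - 1) * math.log(1 - r2_cs / shrinkage))


# Initial planning scenario from the text: 4 parameters, ~8% prevalence, R2 = 0.15.
n = required_sample_size(n_params=4, prevalence=0.08, r2_nagelkerke=0.15)
events = math.ceil(n * 0.08)
print(math.ceil(n), events, events / 4)  # participants, events, events per parameter
```

Under these assumptions the initial scenario yields 44 events (EPP = 11). The interim figures depend on the exact prevalence observed at the review and on any additional criteria applied, so they are not reproduced here.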

Model Development and Validation
We explored the relationship between candidate predictors and the primary outcome using locally weighted scatterplot smoothing (LOWESS) to identify nonlinear patterns. Transformations were used when serious violations of linearity were detected. We used penalized logistic (ridge) regression to develop the models and shrink regression coefficients to minimize model optimism. All predictors were prespecified, and no predictor selection was performed during model development. Because few data were missing (< 3% for any single predictor), missing observations were replaced with the median value, grouped by outcome status. A sensitivity analysis was conducted using complete-case analysis.
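The imputation step described above (median replacement within outcome groups) can be sketched as follows. This is an illustrative example with made-up values, not the study's code; the function name is hypothetical and missing values are assumed to be represented as `None`.

```python
from statistics import median


def impute_median_by_outcome(values, outcomes):
    """Replace missing values (None) with the median of observed values
    among participants with the same outcome status."""
    medians = {}
    for status in set(outcomes):
        observed = [v for v, o in zip(values, outcomes)
                    if o == status and v is not None]
        medians[status] = median(observed)  # raises if a group has no observed values
    return [medians[o] if v is None else v for v, o in zip(values, outcomes)]


# Hypothetical biomarker values; 1 = met the primary outcome, 0 = did not.
crp = [12.0, None, 40.0, 8.0, None, 95.0, 60.0]
outcome = [0, 0, 0, 0, 1, 1, 1]
print(impute_median_by_outcome(crp, outcome))
```

Because the replacement value depends on outcome status, this approach is only applicable where the outcome is already known, as in model development and validation.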
We assessed discrimination (c-statistics) and calibration (calibration plots and slopes) for each model in the validation cohort, and examined classifications (true positives [TP], false positives [FP], true negatives [TN], false negatives [FN]) at clinically relevant cut-points (predicted probabilities). Finally, recognizing that the relative value of a TP and FP will vary at different stages of the pandemic [20], we examined the potential clinical utility of the models using decision curve analyses to quantify the net benefit between correctly identified TP or TN and incorrectly identified FP or FN at a range of plausible trade-offs (threshold probabilities) [32].
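For readers less familiar with the c-statistic, it can be computed as the proportion of all event/non-event pairs in which the participant who met the outcome received the higher predicted risk, with ties counted as one half. The sketch below uses hypothetical predicted risks and a hypothetical function name.

```python
def c_statistic(predicted, outcomes):
    """Concordance: share of event/non-event pairs ranked correctly (ties count 0.5)."""
    events = [p for p, y in zip(predicted, outcomes) if y == 1]
    non_events = [p for p, y in zip(predicted, outcomes) if y == 0]
    concordant = 0.0
    for p_event in events:
        for p_non_event in non_events:
            if p_event > p_non_event:
                concordant += 1.0
            elif p_event == p_non_event:
                concordant += 0.5
    return concordant / (len(events) * len(non_events))


# Hypothetical predicted risks; 1 = met the primary outcome.
risk = [0.05, 0.10, 0.20, 0.30, 0.15, 0.40]
y = [0, 0, 0, 1, 1, 1]
print(round(c_statistic(risk, y), 3))  # 8 of 9 pairs are concordant
```

A value of 0.5 corresponds to ranking no better than chance, and 1.0 to perfect separation of the two groups.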
All analyses were done in R v4.0.3.

Ethical Approvals
This received oxygen via a face mask and/or nasal cannula (1 outside the study facilities), and 12 had an SpO2 < 94% but did not receive oxygen supplementation (Supplementary Table 4; Supplementary Figure 3). Relationships between candidate predictors and the primary outcome are illustrated (Supplementary Figure 4), and c-statistics (continuous predictors) and odds ratios (continuous and categorical predictors) reported (Supplementary Table 5). The full models are presented in the Supplementary Materials (Supplementary Table 5; Supplementary Figure 5). After adjustment for the 3 clinical variables, 5 biomarkers (CRP, D-dimer, IL-6, NLR, and suPAR) were independently associated with development of an oxygen requirement.

Prognostic Models
Discrimination and calibration of each model in the validation cohort are presented in Figure 3. C-statistics ranged from 0.66 (clinical model and model containing PCT) to 0.74 (model containing IL-6). Calibration slopes ranged from 0.62 (model containing PCT) to 1.01 (model containing suPAR). Calibration was better at lower predicted probabilities, with some models overestimating risk at higher predicted probabilities.
The ability of each model to rule out progression to oxygen requirement amongst patients with moderate COVID-19 at predicted probabilities (cutoffs) of 10%, 15%, and 20% is shown (Figure 6). A cutoff of 10% reflects a management strategy equivalent to admitting any patient in whom the predicted risk of developing an oxygen requirement is ≥ 10%. At this cutoff, the results suggest that a model containing the 3 clinical parameters (age, sex, and SpO2) without any biomarkers could facilitate correctly sending home ~25% of patients with moderate COVID-19 who would not subsequently require supplemental oxygen, at the cost of also sending home ~9% of moderate patients who would deteriorate and require supplemental oxygen, that is, a ratio of correctly to incorrectly discharged patients of 10:1.
The inclusion of either NLR or suPAR improved the predictive performance such that the ratio of correctly to incorrectly discharged patients increased to 23:1 or 25:1 respectively, whilst a model containing IL-6 resulted in a similar proportion (~21%) of correctly discharged patients as the clinical model but without missing any patients who would deteriorate and require supplemental oxygen. Inclusion of the other candidate biomarkers (CRP, D-dimer, PCT, or sTREM-1) did not improve the ability of the clinical model to rule out progression to supplemental oxygen requirement.
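The rule-out logic at a given cutoff can be made concrete with a short sketch. Participants with a predicted risk below the cutoff are treated as candidates for discharge; true negatives are correct discharges and false negatives are missed deteriorations. The data and function name below are hypothetical and do not reproduce the study's results.

```python
def rule_out_summary(predicted, outcomes, cutoff):
    """Classify participants at a risk cutoff: admit if risk >= cutoff, else send home."""
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for p, y in zip(predicted, outcomes):
        if p >= cutoff:
            counts["TP" if y == 1 else "FP"] += 1   # admitted
        else:
            counts["FN" if y == 1 else "TN"] += 1   # sent home
    return counts


# Hypothetical predicted risks; 1 = met the primary outcome.
risk = [0.04, 0.06, 0.08, 0.09, 0.12, 0.25, 0.30, 0.07]
y = [0, 0, 0, 0, 0, 1, 1, 1]
summary = rule_out_summary(risk, y, cutoff=0.10)
print(summary)
print(f"correct:incorrect discharges = {summary['TN']}:{summary['FN']}")
```

Raising the cutoff sends more patients home, which increases both correct and incorrect discharges; the ratio between the two is what distinguishes the candidate models at a fixed cutoff.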

Generalizability
We recognized that the relative value of a TP and FP, that is, admitted patients who would and would not subsequently require supplemental oxygen, was not fixed and would vary at different stages of the pandemic, reflecting bed pressures and/or capacity for follow-up [20]. Decision curve analyses accounting for this differential weighting suggest that the clinical model could provide utility (net benefit over an "admit-all" approach) at a threshold probability above 15% (ie, when the value of 1 TP is equal to ~7 FPs). Furthermore, the results indicate that models containing any 1 of IL-6, NLR, or suPAR could offer greater net benefit than the clinical model and extend the range of contexts in which a model might provide utility to include threshold probabilities above 5% (value of 1 TP is equal to 19 FPs; ie, when bed pressures are less critical). For the model containing IL-6, this higher net benefit appeared to be maintained across a range of plausible threshold probabilities (Figure 4).
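The link between a threshold probability and the implied exchange rate between TPs and FPs, and the net benefit quantity plotted in a decision curve, can be sketched as below. The standard formulation of net benefit (TP/n minus FP/n weighted by the threshold odds) is assumed here; the counts are hypothetical and the function names are illustrative only.

```python
def false_positives_per_true_positive(threshold: float) -> float:
    """Number of FPs considered equivalent in value to 1 TP at a threshold probability."""
    return (1 - threshold) / threshold


def net_benefit(tp: int, fp: int, n: int, threshold: float) -> float:
    """Net benefit of a strategy: TPs per patient minus FPs per patient weighted by threshold odds."""
    return tp / n - (fp / n) * threshold / (1 - threshold)


def net_benefit_admit_all(events: int, n: int, threshold: float) -> float:
    """Admit-all reference strategy: every event is a TP and every non-event is an FP."""
    return net_benefit(events, n - events, n, threshold)


print(round(false_positives_per_true_positive(0.05), 2))  # threshold of 5%

# Hypothetical cohort: 100 patients, 20 events; a model admits 18 events and 50 non-events.
nb_model = net_benefit(tp=18, fp=50, n=100, threshold=0.10)
nb_all = net_benefit_admit_all(events=20, n=100, threshold=0.10)
print(round(nb_model, 4), round(nb_all, 4))
```

At a threshold probability of 5% the exchange rate is 19 FPs per TP, matching the value quoted above. A model provides utility at a given threshold when its net benefit exceeds that of the admit-all strategy.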

DISCUSSION
We report the development and temporal validation of 3 promising clinical prediction models to assist with the assessment of patients with moderate COVID-19. The models combine 3 simple parameters (age, sex, and SpO2) with measurement of a single biochemical biomarker (IL-6, NLR, or suPAR), quantifiable using commercially available rapid tests.
We included patients in whom there is clinical uncertainty as to whether admission is warranted, and adopted an analytical approach which acknowledged that the trade-offs inherent in this decision will vary at different stages of the pandemic and in different healthcare settings. We used specific systemic symptoms to define moderate severity disease rather than the WHO-CPS, recognizing, as did the scale's original authors, that the lower end of the WHO-CPS is subjective [19]. Performance of any prediction model is sensitive to the prevalence of the outcome it aims to predict, and thus we hope our more objective study entry criteria will better standardize the outcome prevalence and facilitate model transportability; we followed the widely used ISARIC case report form to define symptoms to permit validation by other groups [33].
Our approach focused on quantifying the added value of host biomarkers. We recognize that laboratory tests carry an opportunity cost, especially when resources are limited. Although a model containing clinical parameters alone would be simpler to implement, our analyses indicate that inclusion of 1 biomarker test would allow use of the model in a broader range of contexts, including when bed pressures are less acute early in a COVID-19 surge.
Our models have face validity. All clinical and laboratory predictors have been implicated in the pathogenesis of COVID-19 [10,17,23,25,27,29]. Similar to others, we found that age and sex were not strongly associated with risk of deterioration, in contrast to their well-recognized association with COVID-19 mortality [23]. This underlines the importance of developing models for specific clinical use cases. Models developed to predict mortality are not necessarily appropriate to rule out less severe disease, just as models developed in well-resourced healthcare systems may not generalize to resource-limited settings [34].
The 3 biochemical biomarkers that demonstrate most promise in our study have biological plausibility. In addition to being a therapeutic target [35], raised IL-6 levels predict development of an oxygen requirement [27,28] and, along with an elevated NLR, form part of the COVID-19-associated hyperinflammatory syndrome (cHIS) diagnostic criteria [36]. Elevated suPAR levels are associated with disease severity and progression in both moderate and severe COVID-19 [29,37] and have been used for stratification into trials of immunomodulatory agents [38].
We addressed the limitations identified in other COVID-19 prognostic models by following the TRIPOD guidelines [18], and using a prospectively collected data set with minimal loss to follow-up and missing data [8]. Nevertheless, the small validation cohort (determined by the natural history of the pandemic in India) limits our ability to draw strong conclusions. Although the same models appeared superior in the different analyses we performed, further external validation is required before they can be recommended for use; we have published our full models (Supplementary Table 5; Supplementary Figure 5) to encourage independent validation.
No vaccinated individuals were included in the study. The models may require recalibration for use in vaccinated populations with lower baseline risk of progression to severe COVID-19. However, it is important to note that only 15/54 African countries met the WHO target of vaccinating 10% of their population by the end of September 2021 [39]. An estimated 55-70% vaccination coverage is required to achieve herd immunity for a vaccine with 90% efficacy [40]. Unfortunately, the timelines for adequate vaccination coverage in many LMICs are likely to be long. In our context, corticosteroids were readily available and often self-prescribed or used off-license. Although steroid use was associated with some candidate predictors, it was not associated with the primary outcome and is therefore unlikely to have confounded the observed association (Supplementary Tables 7-8).
We selected oxygen requirement as our primary outcome as this reflects a clinically meaningful endpoint. We opted to use an SpO2/FiO2 < 400 for participants without documented hypoxia or tachypnoea prior to initiation of supplemental oxygen, as the threshold for oxygen therapy can be subjective and vary depending on available resources [19,22]. It is unlikely that our outcome lacked sensitivity; only 1 participant who received supplemental oxygen did not meet the primary outcome. It may have lacked specificity (12 participants who met the primary outcome did not receive supplemental oxygen and calculation of FiO2 in nonventilated patients can overestimate pulmonary dysfunction) [41], but sensitivity would always be prioritized in a tool to inform community-based management. Furthermore, any outcome misclassification is likely to have reduced, rather than exaggerated, the prognostic performance of the candidate predictors and models [42].
[Figure legend, partial] ... for that particular model; blue rug plots indicate distribution of predicted risk for participants who did (top) and did not (bottom) meet the primary outcome; grey shaded rectangle indicates region within which no individual participant's predicted risk falls for that particular model. C-statistics indicate how well participants who met the primary outcome are differentiated from those who did not; perfect discrimination is indicated by a c-statistic of 1.0. Calibration slopes indicate agreement between predicted probabilities and observed outcomes; perfect calibration is indicated by a slope of 1.0. Abbreviations: CRP, C-reactive protein; IL-6, interleukin 6; NLR, neutrophil-to-lymphocyte ratio; PCT, procalcitonin; sTREM-1, soluble triggering receptor expressed on myeloid cell-1; suPAR, soluble urokinase plasminogen activator receptor.
Baseline Ct value was not associated with the risk of deterioration (Supplementary Table 9). In keeping with others, we found that seronegativity at enrollment was associated with an increased risk of deterioration (49/190 [25.8%] vs. 37/222 [16.7%]; χ² = 5.16; P = .023) [43,44]. As rapid antibody tests are available this warrants further exploration, acknowledging that this is likely most relevant in patients without a history of previous COVID-19 illness or vaccination.
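The serostatus comparison can be checked from the reported counts alone. The sketch below computes a Pearson chi-square statistic for the 2 × 2 table, assuming no continuity correction was applied (an assumption, although it is consistent with the reported value); the function name is hypothetical.

```python
import math


def pearson_chi2_2x2(a: int, b: int, c: int, d: int):
    """Pearson chi-square (1 df, no continuity correction) and P value for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = math.erfc(math.sqrt(chi2 / 2))  # upper tail of chi-square with 1 df
    return chi2, p_value


# Seronegative: 49 of 190 deteriorated; seropositive: 37 of 222 deteriorated.
chi2, p_value = pearson_chi2_2x2(49, 190 - 49, 37, 222 - 37)
print(round(chi2, 2), round(p_value, 3))
```

The result agrees with the reported χ² = 5.16 and P = .023.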
In conclusion, we present 3 clinical prediction models that could help clinicians to identify patients with moderate COVID-19 who are suitable for community-based management. The models address an unmet need in the COVID-19 care continuum. They are of particular relevance where resources are scarce and, if validated, would be practical for implementation. Routinely collected data from Médecins Sans Frontières (MSF) medical facilities across 26 LMICs indicate that 54.4% (18 400/33 780) of patients presenting with clinically suspected COVID-19 between March 2020 and November 2021 who might be considered for admission, or 16.2% of all patients (18 400/113 455), would have been eligible for assessment using our models, illustrating the potential for widespread impact.