COVID‐19 outcomes in patients with cancer: Findings from the University of California health system database

Abstract
Background:
The interaction between cancer diagnoses and COVID-19 infection and outcomes is unclear. We leveraged a state-wide, multi-institutional database to assess cancer-related risk factors for poor COVID-19 outcomes.
Methods:
We conducted a retrospective cohort study using the University of California Health COVID Research Dataset, which includes electronic health data of patients tested for severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) at 17 California medical centers. We identified adults tested for SARS-CoV-2 from 2/1/2020–12/31/2020 and selected a cohort of patients with cancer. We obtained demographic, clinical, cancer type, and antineoplastic therapy data. The primary outcome was hospitalization within 30d after the first positive SARS-CoV-2 test. Secondary outcomes were SARS-CoV-2 positivity and severe COVID-19 (intensive care, mechanical ventilation, or death within 30d after the first positive test). We used multivariable logistic regression to identify cancer-related factors associated with outcomes.
Results:
We identified 409,462 patients undergoing SARS-CoV-2 testing. Of 49,918 patients with cancer, 1781 (3.6%) tested positive. Patients with cancer were less likely to test positive (RR 0.70, 95% CI: 0.67–0.74, p < 0.001). Among the 1781 SARS-CoV-2-positive patients with cancer, BCR/ABL-negative myeloproliferative neoplasms (RR 2.15, 95% CI: 1.25–3.41, p = 0.007), venetoclax (RR 2.96, 95% CI: 1.14–5.66, p = 0.028), and methotrexate (RR 2.72, 95% CI: 1.10–5.19, p = 0.032) were associated with greater hospitalization risk. Cancer and therapy types were not associated with severe COVID-19.
Conclusions:
In this large, diverse cohort, cancer was associated with a decreased risk of SARS-CoV-2 positivity. Patients with BCR/ABL-negative myeloproliferative neoplasm or receiving methotrexate or venetoclax may be at increased risk of hospitalization following SARS-CoV-2 infection. Mechanistic and comparative studies are needed to validate findings.


| INTRODUCTION
The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) pandemic has caused over 5.5 million deaths worldwide. 1 The relationship between cancer and SARS-CoV-2 infections and outcomes remains a topic of ongoing controversy, and of importance to both oncologists and their patients.
Thus far, most studies investigating these relationships have been limited in size or detail. Though several studies have identified associations between SARS-CoV-2 infection or outcomes and some cancer diagnoses, such as lung and hematologic malignancies, 2-10 large cohort studies of patients with cancer and COVID-19 have grouped malignancies or excluded certain cancer types, such as myeloproliferative neoplasms. 4,7,10 Since there is heterogeneity in the biology and treatment of cancer based on specific cancer histologies, outcomes following SARS-CoV-2 infection may be differentially impacted by cancer type. Deeper investigation of outcomes based on cancer type is needed.
Moreover, it remains uncertain whether recent antineoplastic systemic therapy use, particularly cytotoxic chemotherapy, is a risk factor for poor COVID-19 outcomes. A recent meta-analysis of 26 cohort studies found no adverse effect of various anti-cancer therapies on COVID-19 severity or mortality, 11 perhaps due to the categorization of heterogeneous groups of therapies. Detailed examination of outcomes based on specific therapies is needed. Clarifying these cancer-related risk factors is useful in counseling and management of patients with cancer and COVID-19.
Informatics-based analyses of large, real-world data sets may facilitate an understanding of the relationships between these potential risk factors and COVID-19 outcomes. We leveraged the University of California Health COVID Research Data Set (UC CORDS), 12 which aggregates the electronic health records data of all patients who underwent testing for SARS-CoV-2 at University of California (UC)-affiliated hospitals. We hypothesized that understudied cancer types, for example, hematologic cancer subtypes and specific systemic therapies, such as lymphocyte-depleting therapies, are associated with higher hospitalization, intensive care use, and death following COVID-19.

| METHODS
We conducted a retrospective cohort study of patients using UC CORDS v2.0. 12 This limited data set includes prospectively collected electronic health data of all patients who underwent quantitative reverse transcription-polymerase chain reaction (RT-qPCR) testing for SARS-CoV-2 at 5 UC academic medical centers (Davis, Irvine, Los Angeles, San Diego, and San Francisco) and 12 affiliated California hospitals. UC CORDS is organized using the Observational Medical Outcomes Partnership common data model, which contains diagnoses, medications, labs, and procedures associated with clinical encounters. Data are refreshed on a weekly basis. The study protocol was reviewed and approved by both UCSF and Lawrence Livermore National Laboratory institutional review boards.


KEYWORDS
cancer, COVID-19, myeloproliferative neoplasm, outcomes research

within 1 year prior to the test date, that is, "index date." For patients with a positive SARS-CoV-2 RT-qPCR test, the index date was the first positive date; otherwise, the index date was the first negative test date. These criteria were intended to select patients being routinely followed for their cancer. Patients with only basal and squamous cell cutaneous cancers were excluded given the extremely low morbidity and mortality of these cancers. Patients with other/unknown gender were excluded, since none of these patients had both cancer and COVID-19. For analysis of severe COVID-19 (defined below), a third cohort of cancer patients with COVID-19 who were hospitalized within 30 days of index date was created to enrich for laboratory data and likelihood that outcomes are attributable to COVID-19. Figure S1 shows a flow diagram of these three cohorts.
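The index date rule above (first positive test date if any test was positive, otherwise first negative test date) can be sketched in a few lines. This is only an illustration: the function name `index_date` and the list-of-tuples layout are assumptions made here, not the structure of UC CORDS.

```python
from datetime import date

def index_date(tests):
    """Return the index date for one patient.

    `tests` is a list of (test_date, is_positive) tuples; this layout is
    a hypothetical stand-in for the real test records.
    """
    positives = [d for d, is_positive in tests if is_positive]
    if positives:
        # Any positive test: the index date is the first positive date.
        return min(positives)
    negatives = [d for d, is_positive in tests if not is_positive]
    # No positive test: the index date is the first negative test date.
    return min(negatives) if negatives else None

# Illustrative patient: one negative test followed by two positive tests.
example_tests = [
    (date(2020, 4, 2), False),
    (date(2020, 7, 15), True),
    (date(2020, 9, 1), True),
]
example_index = index_date(example_tests)
print(example_index)
```

For a patient with only negative tests, the same function returns the earliest test date.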

| Independent variables
For demographic variables, we included birth year, gender, race, and ethnicity. For clinical variables, we included cancer type; comorbidities that are known to be associated with COVID-19 severity in patients with cancer (coronary artery disease, congestive heart failure, chronic kidney disease, chronic obstructive pulmonary disease, and asthma within 1 year prior to index date); and body mass index. 13 Primary cancer types were granularly categorized as hematologic (acute leukemia, chronic lymphocytic leukemia/small lymphocytic lymphoma, chronic myeloid leukemia, lymphoma, myelodysplastic syndrome, BCR/ABL-negative myeloproliferative neoplasm, plasma cell dyscrasia, and other); solid (breast, gastrointestinal, germ cell, gynecological, head and neck, hepatobiliary/pancreatic, lung, melanoma, nervous system, neuroendocrine/endocrine, prostate, sarcoma, urinary tract, and other); multiple cancer types; and unspecified cancer type (Table S1). Antineoplastic systemic therapy use from 60 days prior to index date to 30 days after was included. Antineoplastic systemic therapies were categorized by adapting the National Library of Medicine RxClass into antibody, chemotherapy, hormone therapy, immune-based therapy (includes agents that stimulate or modulate the immune system), tyrosine kinase inhibitor, other cytotoxic therapy, and other targeted therapy (Table S2). 14 We also included laboratory data from 60 days prior to the index date to 30 days afterward.
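The exposure window for systemic therapy (60 days before the index date through 30 days after) amounts to a simple date-range check. The sketch below assumes inclusive window boundaries and a plain list of administration dates per patient; both are assumptions made for illustration, and `on_active_therapy` is a hypothetical helper name.

```python
from datetime import date, timedelta

def on_active_therapy(admin_dates, index, days_before=60, days_after=30):
    """True if any administration date falls inside the exposure window.

    The window runs from `days_before` days before the index date to
    `days_after` days after it; treating both ends as inclusive is an
    assumption of this sketch.
    """
    start = index - timedelta(days=days_before)
    end = index + timedelta(days=days_after)
    return any(start <= d <= end for d in admin_dates)

index = date(2020, 7, 15)
print(on_active_therapy([date(2020, 5, 16)], index))  # exactly 60 days before
print(on_active_therapy([date(2020, 8, 20)], index))  # 36 days after
```

Setting `days_after=0` restricts the window to the 60 days before the index date only.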

| Dependent variables
Outcomes included SARS-CoV-2 positivity (at least 1 positive RT-qPCR test); hospitalization within 30 days after the index date; and a composite endpoint for severe COVID-19, defined as either intensive care unit admission, need for mechanical ventilation, or death within 30 days after the index date.
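As a minimal sketch of the composite endpoint, severe COVID-19 can be flagged when any of the three component events occurs within 30 days after the index date. The argument names and the use of `None` for an event that did not occur are assumptions made here for illustration.

```python
from datetime import date, timedelta

def severe_covid(index, icu=None, ventilation=None, death=None, window_days=30):
    """Composite endpoint: ICU admission, mechanical ventilation, or death
    on or within `window_days` days after the index date.

    Each event argument is a date, or None if the event did not occur.
    """
    end = index + timedelta(days=window_days)
    events = (icu, ventilation, death)
    return any(d is not None and index <= d <= end for d in events)

index = date(2020, 7, 15)
print(severe_covid(index, icu=date(2020, 7, 20)))    # ICU admission on day 5
print(severe_covid(index, death=date(2020, 9, 30)))  # death outside the window
```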

| Statistical analysis
We first calculated the incidence of SARS-CoV-2 test positivity among all patients tested for SARS-CoV-2, regardless of cancer. Then, we fit a series of multivariable logistic regression models for each of the three cohorts described above (all patients tested for SARS-CoV-2, COVID-19-positive patients with cancer, and hospitalized COVID-19-positive patients with cancer). To predict the risk of SARS-CoV-2 test positivity in the overall cohort, we created a multivariable model that included age, gender, race, ethnicity, comorbidities, any cancer history, and receipt of any systemic therapy. Laboratory tests were not included given the high rate of missing values. To identify cancer-related risk factors for hospitalization in the cohort of patients with cancer and COVID-19 (the primary outcome), we created two multivariable models: one in which systemic therapies were categorized and another in which individual therapies were delineated. Both models compared specific cancer types to unspecified cancer type. The unspecified category was chosen as the reference because it consisted of non-specific diagnostic codes that would likely span a range of cancer types (e.g., ICD-10 C79.51, secondary neoplasm of bone). Again, laboratory tests were not included in these models given the high rate of missing values. Lastly, to evaluate the risk of severe COVID-19, we used the cohort restricted to hospitalized patients with cancer and COVID-19. We created a multivariable model with systemic therapies as categories and laboratory tests as continuous variables, only including tests that at least 70% of patients had completed. We did not incorporate individual therapies in this analysis because few patients were associated with each therapy.
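The 70% completeness rule for laboratory tests can be illustrated with a small filter. The record layout (one dictionary per patient, with `None` for a missing result) and the lab names are hypothetical; only the 70% threshold comes from the text.

```python
def labs_meeting_threshold(records, threshold=0.70):
    """Return the lab names completed by at least `threshold` of patients.

    `records` is a list of dicts mapping lab name -> value, with None
    (or an absent key) meaning the test was not completed.
    """
    n = len(records)
    names = sorted({name for record in records for name in record})
    keep = []
    for name in names:
        completed = sum(1 for record in records if record.get(name) is not None)
        if n and completed / n >= threshold:
            keep.append(name)
    return keep

# Ten illustrative patients: platelets completed by 8, fibrinogen by 5.
records = [{"platelets": 200.0, "fibrinogen": 300.0} for _ in range(5)]
records += [{"platelets": 180.0, "fibrinogen": None} for _ in range(3)]
records += [{"platelets": None, "fibrinogen": None} for _ in range(2)]
kept = labs_meeting_threshold(records)
print(kept)
```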
Logistic regression results were visualized using forest plots showing adjusted relative risk ratios with 95% confidence intervals; p < 0.05 was considered significant. Multiple imputation was used for missing laboratory values. 15 We did not correct for multiple comparisons given the exploratory nature of the study. 16 The logistic regression models were implemented using the statsmodels module in the Python programming language (v3.8). 17
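To make the modeling step concrete, the following is a dependency-free sketch of logistic regression with an intercept and a single binary covariate, fit by Newton-Raphson. It is not the study's code: the authors used statsmodels with many covariates, the counts below are invented, and `fit_logistic_2param` is a hypothetical helper. The sketch reports an odds ratio with a Wald 95% confidence interval; how the adjusted relative risk ratios were derived from the fitted models is not stated in the text and is not reproduced here.

```python
import math

def fit_logistic_2param(data, n_iter=25):
    """Fit logit(p) = b0 + b1 * x by Newton-Raphson on grouped data.

    `data` is a list of (x, y, count) with x and y in {0, 1}.
    Returns (b0, b1, se_b1).
    """
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        g0 = g1 = 0.0          # gradient of the log-likelihood
        h00 = h01 = h11 = 0.0  # information matrix (negative Hessian)
        for x, y, n in data:
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            g0 += n * (y - p)
            g1 += n * (y - p) * x
            w = n * p * (1.0 - p)
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: beta <- beta + H^{-1} g, using the 2x2 inverse.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    # Variance of b1 is the matching diagonal entry of H^{-1}; the matrix
    # comes from the last iteration, which has converged by this point.
    se_b1 = math.sqrt(h00 / det)
    return b0, b1, se_b1

# Invented counts: 30/100 events among exposed, 20/100 among unexposed.
toy_data = [(1, 1, 30), (1, 0, 70), (0, 1, 20), (0, 0, 80)]
b0, b1, se_b1 = fit_logistic_2param(toy_data)
odds_ratio = math.exp(b1)
ci_low = math.exp(b1 - 1.96 * se_b1)
ci_high = math.exp(b1 + 1.96 * se_b1)
print(f"OR {odds_ratio:.2f} (95% CI {ci_low:.2f}-{ci_high:.2f})")
```

With one binary covariate the fitted odds ratio equals the crude odds ratio of the 2x2 table, (30/70)/(20/80), which gives a quick check on the fit. In a multivariable model each coefficient is instead adjusted for the other covariates.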

| RESULTS
Overall, 24,177 of 409,462 (5.9%) patients undergoing SARS-CoV-2 RT-qPCR tested positive for SARS-CoV-2 at any time during the study period. Of the 49,918 patients with a history of cancer, 1781 (3.6%) tested positive. The mean age of SARS-CoV-2-positive patients with cancer was 59 years (SD = 16); 950 (53%) were female, 939 (53%) were White, and 636 (36%) were Hispanic or Latino (Table 1). The most common cancer types were Multiple (N = 293, 16%); Breast (N = 241, 14%); and Prostate (N = 122, 7%) per Table 1. Three hundred twenty-four (18%) patients were on active systemic therapy, of which chemotherapy was the most common (N = 153, 9%). Individual therapies are listed in Table S3. Figure 1 describes factors associated with SARS-CoV-2 test positivity in the entire cohort. In terms of cancer-related factors, positive cancer history was associated with a decreased risk of a positive test (RR 0.70, 95% CI: 0.67-0.74, p < 0.001). Similarly, any systemic therapy use was associated with a decreased risk of a positive test (RR 0.77, 95% CI: 0.70-0.85, p < 0.001). In terms of specific cancer types, many were associated with a decreased risk or no difference in risk of a positive test compared to unspecified cancer types (Figure S2).
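The positivity percentages above follow directly from the reported counts. The sketch below also computes a crude (unadjusted) risk ratio as a worked contrast with the adjusted estimate. Treating every tested patient outside the cancer cohort as the comparison group is a simplifying assumption made here, not the authors' analysis.

```python
# Counts reported in the text.
tested_all, positive_all = 409_462, 24_177
tested_cancer, positive_cancer = 49_918, 1_781

# Assumption of this sketch: everyone outside the cancer cohort forms
# the comparison group.
tested_other = tested_all - tested_cancer
positive_other = positive_all - positive_cancer

rate_all = positive_all / tested_all
rate_cancer = positive_cancer / tested_cancer
rate_other = positive_other / tested_other
crude_rr = rate_cancer / rate_other

print(f"overall {rate_all:.1%}, cancer {rate_cancer:.1%}, crude RR {crude_rr:.2f}")
# prints: overall 5.9%, cancer 3.6%, crude RR 0.57
```

The crude ratio (about 0.57) is further from 1 than the adjusted RR of 0.70 reported above, consistent with part of the raw difference being accounted for by the covariates in the multivariable model.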
In a post hoc sensitivity analysis of this model, we changed the time window of therapy receipt from 60 days before through 30 days after positive test date to 60 days before the positive test date. Methotrexate remained associated with an increased risk of hospitalization (RR 2.98, 95% CI: 1.06-5.86, p = 0.040), as did venetoclax (RR 2.97, 95% CI: 1.04-5.87, p = 0.042; data not shown).
With the hypothesis that COVID-19 severity for patients with myeloproliferative neoplasms varies based on abnormalities in thrombosis-related laboratory values, we conducted a post hoc analysis in which we added interaction terms between myeloproliferative neoplasms and platelet count and between myeloproliferative neoplasms and fibrinogen to the model. The interaction terms were not significant (Figure S3).

| DISCUSSION
In this study, we used a state-wide multi-institution collection of electronic health record data comprising all patients undergoing SARS-CoV-2 testing in UC health systems. We identified cancer-related factors associated with adverse outcomes that had not previously been described.
Notably, we found a decreased risk of SARS-CoV-2 positivity in patients with cancer compared to those without cancer. This is counter to studies that report an increased risk of infection in patients with cancer, 2,7,18 and others that report no difference in risk. 19,20 The discrepancy between these studies and ours may be related to greater protective behaviors and testing practices in UC patients with cancer. For example, UC patients with cancer may be more likely to employ behaviors that decrease transmission (e.g., social distancing and mask-wearing) or carry a lower threshold to undergo testing compared to patients with cancer in other regions. There is also likely selection bias, as UC patients with cancer may have undergone more tests than those without cancer due to mandated asymptomatic testing prior to infusions, radiation therapy, and surgeries, as had been instituted in some UC medical centers. A similar selection bias was found to potentially explain the association between allergy medication use and decreased SARS-CoV-2 infection. 21 We found that patients receiving systemic therapy were also less likely to test positive, perhaps for similar reasons. Therefore, these findings should be interpreted with caution. Future studies could examine patient behaviors and SARS-CoV-2 testing indication to better investigate this discrepancy.
With more granular investigation of the role of cancer types and therapies, we also identified previously undescribed risk factors for hospitalization in patients with cancer. Patients with BCR/ABL-negative myeloproliferative neoplasms and COVID-19 were at an increased risk of hospitalization. Patients with myeloproliferative neoplasms are in a pro-inflammatory state, with qualitative and quantitative abnormalities in myeloid cells leading to both venous/arterial thrombosis and coagulopathy. 22-24 Similarly, COVID-19 severity is closely related to pro-inflammatory markers, 25 and is also associated with both thrombosis and coagulopathy. 26,27 Therefore, patients with myeloproliferative neoplasms may be particularly susceptible to worse COVID-19 outcomes. 28 To our knowledge, this finding has not been previously described in comparative studies, perhaps because these patients have been excluded or under-represented in cancer cohort studies. Some supportive evidence does exist. In a non-comparative study, Salisbury et al. 29 highlighted a high rate of adverse outcomes in patients with myeloproliferative neoplasms and COVID-19, especially upon ruxolitinib withdrawal. Other groups have reported a similar or decreased risk of mortality in patients with myeloproliferative neoplasms compared to those with other hematologic malignancies. 30-32 In our study, we did not find that hospitalized patients with myeloproliferative neoplasms had a higher risk of severe COVID-19, but this analysis is limited by the small size of this subgroup. Two antineoplastic medications were found to be associated with an increased risk of hospitalization. Venetoclax, a Bcl-2 inhibitor, is commonly used in the treatment of chronic lymphocytic leukemia as a monotherapy, and in combination with other therapies for acute myeloid leukemia.
Adverse outcomes have been previously described in patients receiving venetoclax for chronic lymphocytic leukemia in a non-comparative study, but not in any large, comparative study to our knowledge. 33 Its association with an increased risk of hospitalization may be related to the negative effect on immune function via lymphocyte depletion, a known risk factor for COVID-19 severity, 34 and via reduced interferon-alpha production and dendritic cell depletion; pneumonia is a known toxicity of venetoclax. 35 Another potential mechanism involves ACE2 and Bcl-2. Motaghinejad et al. 36 postulated that increased COVID-19 mortality is partially driven by decreased ACE2 expression in the pulmonary and cardiovascular systems, causing destabilization of Bcl-2 and dysregulation of apoptosis. This dysregulation may be compounded for patients receiving venetoclax, a Bcl-2 inhibitor, leading to cardiopulmonary complications. Methotrexate was also associated with an increased risk of hospitalization. As cytotoxic chemotherapy, methotrexate may increase susceptibility to COVID-19 complications through immunosuppression. Though most studies suggest that chemotherapy, in general, is a risk factor for worse COVID-19 outcomes, several studies have not confirmed the association, likely due to the heterogeneity of different chemotherapy agents and regimens. 4,9,37 Methotrexate as a risk factor has not been studied in cancer patients, but findings for low-dose methotrexate in patients with rheumatologic conditions have been mixed. 38,39 Despite the biologic rationale for poor COVID-19 outcomes in patients receiving venetoclax or methotrexate, we did not find a consistent association across other outcomes. For example, there was no increased risk of severe COVID-19 in hospitalized patients receiving these therapies. 
Though this negative finding should be interpreted with caution given the small sample sizes, we cannot exclude the possibility that these observations are coincidental given the high number of individual therapies included in the models. Further confirmatory studies investigating these potential risk factors should be performed.
We also identified known risk factors for hospitalization following COVID-19. These risk factors included older age, Asian race, Hispanic or Latino ethnicity, coronary artery disease, chronic kidney disease, diabetes mellitus, and chronic obstructive pulmonary disease. Prior studies have demonstrated these associations. 5,8,40-42 In our study, we also demonstrated the feasibility and utility of a prospectively created, frequently and passively updated multi-center dataset. In the COVID-19 and future pandemics, it is critical to rapidly identify patient-related attributes and interventions that affect the risk of infection, morbidity, and mortality. The timely creation of frequently and passively updated data sources that contain patient-level clinical data from multiple health systems, like UC CORDS and N3C, is invaluable to this goal. These data could complement those of other consortium and registry efforts, such as the CCC19, which provide more granular data using human abstraction. 4 For example, Reznikov et al. 18 mined UC CORDS within a few months of its creation and identified antihistamines associated with decreased SARS-CoV-2 test positivity. In vitro drug susceptibility assays showed hypothesis-generating antiviral mechanisms of candidate antihistamines.
Central data sets for future pandemics could include other data types (e.g., imaging, patient symptoms, and genomic data to identify mechanisms and therapeutic targets); real-time data collection (e.g., patient location, activity, and vital data using wearable devices); and data from other health care systems (e.g., Veterans Health Administration and Community Health Centers). Pandemic preparation should also include identification of and resource-allocation to teams to harmonize and standardize raw data, and to query, analyze, and interpret databases. Efforts must be done ethically with potential biases in mind. 43 Moreover, advanced analytic techniques, such as artificial intelligence (AI) approaches, that are findable, accessible, interoperable, and reusable to facilitate the development of new AI applications, could be applied.
FIGURE 1 In a multivariable logistic regression of all 409,462 adult patients who underwent SARS-CoV-2 testing, race, ethnicity, and comorbidities were associated with an increased risk of a positive test. History of cancer and antineoplastic systemic therapy were associated with a decreased risk of a positive test. Adjusted relative risk ratios are shown. CAD, coronary artery disease; CHF, congestive heart failure; COPD, chronic obstructive pulmonary disease. * Denotes p < 0.05
This study has several strengths, including one of the largest cancer cohorts to date, a diverse cohort, use of a novel database, and more granular categorization of cancer types and therapies. There are several limitations. Selection bias may have been introduced from the inclusion of only patients who underwent SARS-CoV-2 testing and from over-representation of patients cared for at academic centers, where asymptomatic testing and COVID-19 treatment practices may have differed from other hospitals. The database does not contain certain risk factors for severe COVID-19, including cancer stage, cancer remission status, smoking status, poor performance status, and socioeconomic variables. Due to the limited sample size of hospitalized patients, we could not evaluate whether venetoclax and methotrexate were associated with severe COVID-19 infection. Multiple testing was not accounted for. Lastly, we could not ascertain the outcomes of patients who sought care outside the UC health system.
FIGURE 2 Adjusted risk of 30-day hospitalization following a positive SARS-CoV-2 test among 1781 adult cancer patients. In a multivariable logistic regression of 1781 adult patients with a history of cancer and a positive SARS-CoV-2 test, Asian race (compared to White race), Hispanic/Latino ethnicity (compared to not Hispanic/Latino ethnicity), multiple comorbidities, myeloproliferative neoplasm (compared to unspecified cancer type), and Other targeted therapy (compared to not) were associated with an increased risk of 30-day hospitalization. Adjusted relative risk ratios are shown. CAD, coronary artery disease; CHF, congestive heart failure; COPD, chronic obstructive pulmonary disease; TKI, tyrosine kinase inhibitor. Chronic Myelogenous Leukemia (N = 11) was combined with Cancer-Hematologic, other given few patients. * Denotes p < 0.05

| CONCLUSION
As the COVID-19 pandemic continues and new, highly transmissible variants emerge, it is important to remain vigilant of risk factors for severe infection. Close attention to patients with risk factors will allow us to better prevent and monitor COVID-19 in high-risk patients. We found that patients with COVID-19 and myeloproliferative neoplasms, and those receiving methotrexate or venetoclax, may be at an increased risk of hospitalization. Further studies to confirm these associations are needed, as are studies to understand underlying mechanisms. Investigation is also needed to explain and confirm the lower risk of test positivity in patients with cancer than those without cancer. Lastly, policy makers and health systems should focus on establishing timely, live central databases of electronic health data to provide rapidly accumulating data for future pandemic preparedness, as well as the human capital needed for their maintenance and use.
FIGURE 3 This multivariable logistic regression is identical to that in Figure 2, except that individual systemic therapies are delineated. Methotrexate (categorized as chemotherapy) and venetoclax (categorized as Other targeted therapy) were associated with an increased risk of 30-day hospitalization. Adjusted relative risk ratios are shown. ADT, androgen deprivation therapy; CAD, coronary artery disease; CHF, congestive heart failure; COPD, chronic obstructive pulmonary disease. Exemestane (N = 11) was combined with Med-Letrozole given few patients. Med-Other includes Dasatinib (N = 7), Ruxolitinib (N = 7), and Sirolimus (N = 8). * Denotes p < 0.05

ACKNOWLEDGMENTS
We thank Dr. Sharat Israni and Dr. Atul Butte for their support. In addition, this document was prepared as an account of work sponsored by an agency of the United States government. Neither the United States government nor Lawrence Livermore National Security, LLC, nor any of their employees makes any warranty, expressed or implied, or assumes any legal liability or responsibility for the accuracy, completeness, or usefulness of any information, apparatus, product, or process disclosed, or represents that its use would not infringe privately owned rights. Reference herein to any specific commercial product, process, or service by trade name, trademark, manufacturer, or otherwise does not necessarily constitute or imply its endorsement, recommendation, or favoring by the United States government or Lawrence Livermore National Security, LLC. The views and opinions of authors expressed herein do not necessarily state or reflect those of the United States government or Lawrence Livermore National Security, LLC, and shall not be used for advertising or product endorsement purposes.
FIGURE 4 In a multivariable logistic regression of 388 adult patients with a history of cancer who were hospitalized within 30 days after a positive SARS-CoV-2 test, no factors were associated with an increased risk of severe COVID-19. Severe COVID-19 was defined as intensive care admission, receipt of mechanical ventilation, or death within 30 days of first positive test. Adjusted relative risk ratios are shown. CAD, coronary artery disease; CHF, congestive heart failure; COPD, chronic obstructive pulmonary disease; TKI, tyrosine kinase inhibitor. * Denotes p < 0.05