Neurophenotypes of COVID-19: risk factors and recovery outcomes

Coronavirus disease 2019 (COVID-19) infection is associated with risk of persistent neurocognitive and neuropsychiatric complications, termed “long COVID”. It is unclear whether the neuropsychological manifestations of COVID-19 present as a uniform syndrome or as distinct neurophenotypes with differing risk factors and recovery outcomes. We examined post-acute neuropsychological profiles following SARS-CoV-2 infection in 205 patients recruited from inpatient and outpatient populations, using an unsupervised machine learning cluster analysis, with objective and subjective measures as input features. This resulted in three distinct post-COVID clusters. In the largest cluster (69%), cognitive functions were within normal limits, although mild subjective attention and memory complaints were reported. Vaccination was associated with membership in this “normal cognition” phenotype. Cognitive impairment was present in the remaining 31% of the sample but clustered into two differentially impaired groups. In 16% of participants, memory deficits, slowed processing speed, and fatigue were predominant. Risk factors for membership in the “memory-speed impaired” neurophenotype included anosmia and more severe COVID-19 infection. In the remaining 15% of participants, executive dysfunction was predominant. Risk factors for membership in this milder “dysexecutive” neurophenotype included disease-nonspecific factors such as neighborhood deprivation and obesity. Recovery outcomes at 6-month follow-up differed across neurophenotypes, with the normal cognition group showing improvement in verbal memory and psychomotor speed, the dysexecutive group showing improvement in cognitive flexibility, and the memory-speed impaired group showing no objective improvement and relatively worse functional outcomes compared to the other two clusters. These results indicate that there are multiple post-acute neurophenotypes of long COVID, with different etiological pathways and recovery outcomes. 
This information may inform phenotype-specific approaches to treatment.


Introduction
Cognitive and psychiatric symptoms are among the most common, persistent, and disabling consequences of COVID-19 (Taquet et al., 2022; Hastie et al., 2022; Davis et al., 2021). Post-COVID cognitive impairment is reported in 15-35% of patients during the chronic "long COVID" recovery phase (Ceban et al., 2022; Becker et al., 2021), with higher rates reported in patients who were hospitalized versus home-isolated during the acute stage of illness (Pihlaja et al., 2023). Self-reported cognitive complaints include problems with concentration, memory, and slowed thinking (Davis et al., 2021; Ceban et al., 2022), commonly referred to as "brain fog". Objective neuropsychological assessment shows predominant impairment in attention, executive functioning, and memory, with relative preservation of language and visuospatial functions (Bertuccelli et al., 2022). The most common post-COVID neuropsychiatric manifestations (new onset and chronic) include fatigue, anxiety, depression, insomnia, and posttraumatic stress disorder (PTSD) (Nalbandian et al., 2021). At this point, it remains unclear whether these heterogeneous cognitive and psychiatric sequelae are uniformly elevated or clustered into distinct post-COVID neurophenotypes. This is important to determine, as different post-acute neurophenotypic presentations could point backwards to different etiological factors and forwards to different recovery outcomes.
Several mechanisms may contribute to persistent cognitive and psychiatric sequelae following infection. These include neuroinvasion of SARS-CoV-2 into brain or neuroepithelial tissue and indirect damage from respiratory failure (hypoxic-ischemic effects), stroke, multi-organ system dysfunction, inflammasome activation, and complex pandemic-related psychosocial factors, such as social isolation. Recognizing this, we examined objective and subjective neuropsychological outcomes of SARS-CoV-2 infection in ambulatory and hospitalized patients using standardized assessment procedures across successive COVID-19 variant waves. We characterized multivariate neuropsychological clusters (i.e., "neurophenotypes") during the post-acute recovery stage, identified their associative features, and reexamined recovery outcomes six months later with the same symptom inventories and computerized cognitive testing platform. We hypothesized that neurocognitive and neuropsychiatric sequelae of COVID-19 would cluster into distinct neurophenotypes, each related to unique disease-specific and non-specific factors, and each differing in their longitudinal recovery trajectories.

Participants
The Mayo Clinic Institutional Review Board approved this prospective longitudinal cohort study. We electronically obtained informed consent from all participants, recruited between July 2020 and February 2022 from a hospital-wide registry of Mayo Clinic Florida patients who tested positive for SARS-CoV-2 infection. All participants were ≥18 years of age and had no history of major neurocognitive disorders. Participants completed the outcome assessment within 12 weeks of PCR-confirmed infection (post-acute recovery stage) and follow-up assessments six months later (chronic recovery stage). Participation required access to a computer for consent, test, and survey completion. All participants received a digital link to complete assessments and respond to questionnaires.
We abstracted participant demographics and medical history from the electronic health record (EHR) at time of initial post-acute outcome assessment. Medical comorbidities were summarized with the Elixhauser Van-Walraven Index (EVCI). The EVCI includes 31 common chronic medical conditions and was calculated from EHR data up to 1 year before the PCR-positive test date (van Walraven et al., 2009). Higher scores indicate a greater number of medical comorbidities. Vascular risk factors (history of smoking, diabetes, hypertension, and obesity) were identified by ICD-10 diagnostic codes in the patient's EHR (Kloppenborg et al., 2008) and separately summarized due to the specialized role that vascular risk factors play in adverse COVID-19 outcomes (Taquet et al., 2021). Sociodemographic disadvantage was summarized with the 2019 Area Deprivation Index (ADI). ADI scores for each participant were retrieved from the University of Wisconsin-Madison's Neighborhood Atlas (2019), which derives national percentile rankings of socioeconomic disadvantage at the (US) Census Block Group neighborhood level from 1 (least disadvantaged) to 100 (most disadvantaged) based on unemployment rates, poverty, education, and housing (Kind and Buckingham, 2018).
COVID-19 disease severity was determined by an infectious disease specialist (HRP), using the National Institute of Allergy and Infectious Diseases Ordinal Scale (NIAID-OS) (Dodd et al., 2020), with lower scores indicating higher illness severity. The following COVID-19 disease-specific factors were assessed: hospitalization status (ambulatory versus hospitalized), symptom status (symptomatic versus asymptomatic), and presence of anosmia (yes/no). Vaccination status was coded at three levels: vaccine not available (prior to FDA approval); unvaccinated (vaccine FDA approved but participant remained unvaccinated); and vaccinated. Finally, COVID-19 variant type was estimated from peak variant prevalence data at covariants.org (Hodcroft, 2021) by test region (Florida) and time span (binned in 2-week intervals).
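The three-level vaccination coding and the 2-week binning described above can be illustrated with a minimal sketch. The cutoff date is the one given in the Results (infection before 12/17/20); the function names, the "at least one dose on or before the infection date" rule for the vaccinated level, and the bin origin of 1 July 2020 are illustrative assumptions, not the study's documented procedure.

```python
from datetime import date
from typing import Optional

# Cutoff reported in the Results text ("infection before 12/17/20").
VACCINE_AVAILABLE_FROM = date(2020, 12, 17)

# Assumed origin for the 2-week bins (recruitment began in July 2020).
BIN_ORIGIN = date(2020, 7, 1)


def code_vaccination_status(infection_date: date,
                            first_dose_date: Optional[date]) -> str:
    """Return one of the three vaccination levels for a participant.

    Assumption: a participant counts as vaccinated if a first dose was
    recorded on or before the infection date.
    """
    if infection_date < VACCINE_AVAILABLE_FROM:
        return "vaccine_not_available"
    if first_dose_date is not None and first_dose_date <= infection_date:
        return "vaccinated"
    return "unvaccinated"


def two_week_bin(test_date: date, origin: date = BIN_ORIGIN) -> int:
    """Index of the 14-day interval (0-based) that contains test_date."""
    return (test_date - origin).days // 14
```

A participant's bin index could then be matched against regional variant prevalence for the same interval to assign the estimated variant type.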

Neurocognitive Assessment
We assessed objective cognitive performance with the CNS-Vital Signs (CNSVS) computerized neurocognitive assessment during the post-acute and chronic stages of recovery. The CNSVS includes the following neurocognitive domains: (a) verbal memory (immediate and delayed word recognition); (b) visual memory (immediate and delayed design recognition); (c) psychomotor speed (finger tapping and symbol digit coding tests); (d) reaction time (averaged across Stroop congruent and incongruent trials); (e) complex attention (sum of errors from continuous performance, shifting attention, and Stroop tests); and (f) cognitive flexibility (correct responses on the shifting attention test minus the number of errors on the shifting attention test and Stroop test), as previously described (Gualtieri and Johnson, 2006). Domain scores were age-adjusted by comparison to a normative reference group (mean = 100, standard deviation = 15), which was collected by the test publisher prior to the COVID-19 pandemic (Gualtieri and Johnson, 2006). Classification of impairment (< 9th percentile) was based on the American Academy of Clinical Neuropsychology consensus conference statement on uniform labeling of performance test scores (Guilmette et al., 2020). CNSVS includes embedded validity indicators, which show overall high accuracy in identifying intentional attempts to underperform (Anderson et al., 2020). Scores that were flagged as invalid were removed before analyses.
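To make the impairment threshold concrete, the sketch below converts an age-adjusted standard score (mean = 100, SD = 15) to a percentile and applies the < 9th percentile rule. It assumes the normative scores are normally distributed; the function names are illustrative and this is not the test publisher's scoring code.

```python
from statistics import NormalDist

# Normative reference distribution stated in the text (mean 100, SD 15).
# Assumption: the norms are treated as normally distributed.
NORMS = NormalDist(mu=100, sigma=15)

# Impairment threshold from the text: below the 9th percentile.
IMPAIRMENT_PERCENTILE = 9.0


def percentile(standard_score: float) -> float:
    """Percentile rank (0-100) of a standard score under the norms."""
    return 100.0 * NORMS.cdf(standard_score)


def is_impaired(standard_score: float) -> bool:
    """True when the score falls below the 9th percentile."""
    return percentile(standard_score) < IMPAIRMENT_PERCENTILE


# Standard score that sits exactly at the 9th percentile.
CUTOFF_SCORE = NORMS.inv_cdf(IMPAIRMENT_PERCENTILE / 100.0)
```

Under this normal approximation the cutoff lies just below a standard score of 80, so a score of 79 would be classified as impaired and a score of 80 would not.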

Data Imputation
Multivariate Imputation by Chained Equations (MICE) with predictive mean matching for five imputations and 50 iterations (van Buuren and Groothuis-Oudshoorn, 2011) was used to complete missing data from the post-acute neurocognitive assessment and neuropsychiatric symptom inventories, with a <15% missingness threshold (Jakobsen et al., 2017). Little's test statistic was used to assess whether data was missing completely at random (MCAR) between cognitive domains (Little, 1988). Pearson χ2 and analysis of variance (ANOVA) statistical tests for parametric data and Kruskal-Wallis tests for nonparametric data were used to assess clinical and sociodemographic factors associated with cognitive domain data completeness.
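The missingness threshold can be checked per input feature before imputation. The following is a minimal sketch of that screening step only, under the assumption that missing values are represented as None and that the threshold applies to each feature separately; it does not implement MICE or predictive mean matching, and the function names are hypothetical.

```python
from typing import Dict, List, Optional


def missing_fraction(values: List[Optional[float]]) -> float:
    """Fraction of entries in one feature column that are missing (None)."""
    return sum(v is None for v in values) / len(values)


def features_eligible_for_imputation(
    table: Dict[str, List[Optional[float]]],
    threshold: float = 0.15,
) -> List[str]:
    """Names of features whose missingness is strictly below the threshold."""
    return [
        name
        for name, column in table.items()
        if missing_fraction(column) < threshold
    ]
```

Features that pass this screen would then be passed to the imputation procedure; features that fail would need to be handled separately.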

Unsupervised Machine Learning: K-means Clustering
Unsupervised machine learning methods were used to perform cluster analyses, given that they allow for the inference of subgroups (referred to as clusters) within a dataset. Algorithms in unsupervised learning strive to maximize inter-cluster separation and minimize separation among samples within a cluster.
Objective and subjective neuropsychological measures collected during the post-acute illness stage were used as input features in a K-means clustering analysis. No domain bias was applied to inputs. We determined the optimal number of K-means clusters with the elbow method (Kaufman, 1990).
Clinical and sociodemographic factors associated with cluster membership were identified through Pearson χ2 and analysis of variance (ANOVA) statistical tests for parametric data and Kruskal-Wallis tests for non-parametric data. Normality was determined by kurtosis, skewness, and Shapiro-Wilk tests. Friedman non-parametric tests were used to evaluate longitudinal change in cognitive test scores and symptom inventory scores within each cluster. Analysis of variance (ANOVA) statistical tests were used to compare functional (MOS-36 scores) and psychiatric (PTSD scores) outcomes between clusters at the chronic (6-month) recovery stage. Significance was set at p < 0.05. Post-hoc analysis was conducted using Bonferroni correction to counteract Type I errors.
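The K-means clustering and elbow-method selection described in this section can be sketched in plain Python. This is an illustrative implementation under stated assumptions, not the study's analysis code: the text does not name the software used, the restart count and seed are arbitrary, and the elbow is located here with a second-difference heuristic on the within-cluster sum of squares (WCSS), whereas the elbow method is often applied by visual inspection of the WCSS curve. The synthetic data from demo_data are hypothetical and unrelated to the study data.

```python
import random
from typing import Dict, List, Sequence, Tuple

Point = Sequence[float]


def sq_dist(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points: List[Point], k: int, n_init: int = 10,
           max_iter: int = 100, seed: int = 0
           ) -> Tuple[List[List[float]], List[int], float]:
    """Lloyd's algorithm with random restarts.

    Returns (centers, labels, wcss) for the restart with the lowest
    within-cluster sum of squares.
    """
    rng = random.Random(seed)
    best = None
    for _ in range(n_init):
        # Initialize centers at k distinct randomly chosen points.
        centers = [list(p) for p in rng.sample(list(points), k)]
        labels: List[int] = []
        for _ in range(max_iter):
            # Assignment step: each point goes to its nearest center.
            new_labels = [
                min(range(k), key=lambda j: sq_dist(p, centers[j]))
                for p in points
            ]
            if new_labels == labels:
                break  # assignments are stable, so the run has converged
            labels = new_labels
            # Update step: move each center to the mean of its members.
            for j in range(k):
                members = [p for p, lab in zip(points, labels) if lab == j]
                if members:  # keep the old center if a cluster empties
                    centers[j] = [sum(c) / len(members) for c in zip(*members)]
        wcss = sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))
        if best is None or wcss < best[2]:
            best = (centers, labels, wcss)
    return best


def elbow_k(wcss_by_k: Dict[int, float]) -> int:
    """Pick the k with the largest second difference of the WCSS curve.

    Assumption: this heuristic stands in for visual inspection of the elbow.
    Requires WCSS values for at least three consecutive k.
    """
    ks = sorted(wcss_by_k)
    best_k, best_curvature = ks[1], float("-inf")
    for prev, k, nxt in zip(ks, ks[1:], ks[2:]):
        curvature = wcss_by_k[prev] - 2 * wcss_by_k[k] + wcss_by_k[nxt]
        if curvature > best_curvature:
            best_k, best_curvature = k, curvature
    return best_k


def demo_data(seed: int = 42) -> List[Tuple[float, float]]:
    """Hypothetical 2-D data: three well-separated Gaussian blobs."""
    rng = random.Random(seed)
    blob_centers = [(0.0, 0.0), (10.0, 0.0), (5.0, 9.0)]
    return [
        (cx + rng.gauss(0, 1), cy + rng.gauss(0, 1))
        for cx, cy in blob_centers
        for _ in range(30)
    ]
```

A typical use would compute `wcss = {k: kmeans(data, k)[2] for k in range(1, 6)}` and then call `elbow_k(wcss)`. In practice the input features would be the objective domain scores and subjective symptom scores for each participant rather than two synthetic coordinates.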

Missing Data Analyses
The Little's test result (test-statistic (29) = 20.97, p = 0.86) indicated that data was MCAR between cognitive domains. There were no differences associated with patient age, sex, education, ADI, hospitalization status, or NIAID score (p > 0.05) in cognitive domain data completeness. All input features met a <14% missingness threshold before MICE imputation.

Post-Acute Neuropsychological Outcomes K-means Cluster Analysis
We determined with the elbow method that the optimal number of clusters was k = 3. Clusters did not differ in data missingness (χ2 = 3.43, p = 0.22). Cluster 1 (N = 31) was characterized as "dysexecutive" due to impaired cluster centers for cognitive flexibility and complex attention (Figure 1, Supplementary Table 2). This dysexecutive cluster was also characterized by mild-to-moderate complaint severity for anxiety, attention, memory, fatigue, and pain. Cluster 2 (N = 32) was characterized as "memory-speed impaired" due to impaired cluster centers for verbal memory, psychomotor speed, and reaction time, as well as low average cluster centers for visual memory and cognitive flexibility. This memory-speed impaired cluster was also characterized by mild complaint severity for memory, attention, anxiety, depression, and pain, as well as moderate-severe fatigue. Cluster 3 was the largest cluster (N = 142) and was characterized as "normal cognition" due to cluster centers in the average/normal range for all cognitive domains. Notably, despite normal objective cognitive performance, participants in this cluster still reported mild complaint severity for attention, memory, fatigue, and pain.
To facilitate comparison of cognitive impairment rates with other studies reported in the literature, we calculated the rates of subjective and objective cognitive impairment by cluster and by domain in the post-acute recovery stage (Supplementary Table 3).

Disease-specific Risk Factors for Cluster Membership
There were no significant associations between cluster membership and COVID-19 variant type (χ2 = 5.56, p = 0.24) or symptom status (χ2 = 0.42, p = 0.81). There was a marginal relationship between cluster membership and hospitalization status (χ2 = 5.9, p = 0.05), with the highest hospitalization rates in the memory-speed impaired cluster (31%). Cluster membership was associated with vaccination status (χ2 = 11.64, p = 0.02); the normal cognition cluster had the highest vaccination rate (51%), while the memory-speed impaired cluster had the lowest vaccination rate (22%). Lack of vaccine availability at time of infection (i.e., infection before 12/17/20) was the most common reason why patients were unvaccinated across all 3 clusters. There was a strong association between cluster membership and anosmia (χ2 = 12.02, p = 0.002), as well as NIAID scores (H(2) = 10.20, p = 0.006), with the memory-speed impaired cluster showing the highest rate of anosmia (70%) and lowest median NIAID scores (higher disease severity). Full results are presented in Table 1.
Table 1 note. NIAID Ordinal Scale (NIAID-OS) categories: Hospitalized, on invasive mechanical ventilation or extracorporeal membrane oxygenation (ECMO); 6) Hospitalized, on non-invasive ventilation or high flow oxygen devices; 5) Hospitalized, requiring supplemental oxygen; 4) Hospitalized, not requiring supplemental oxygen, requiring ongoing medical care (COVID-19 related or otherwise); 3) Hospitalized, not requiring supplemental oxygen, no longer requires ongoing medical care; 2) Not hospitalized, limitation on activities and/or requiring home oxygen; 1) Not hospitalized, no limitations on activities. Area Deprivation Index (ADI): Socioeconomic disadvantage at the (US) Census Block Group neighborhood level ranging from 1 (least disadvantaged) to 100 (most disadvantaged) based on unemployment rates, poverty, education, and housing. Elixhauser van-Walraven Index (EVCI): Summary index of 31 common chronic medical conditions abstracted from the electronic health record up to 1 year prior to PCR-positive test date. Cluster columns not sharing subscripts differ significantly in mean or median at p < 0.05 after Bonferroni correction.

Longitudinal Recovery Outcomes
Of the original 205 participants who completed assessments in the post-acute recovery stage, 101 (49%) completed a follow-up assessment in the 6-month chronic recovery stage (dysexecutive cluster 1: N=11; memory-speed impaired cluster 2: N=12; normal cognition cluster 3: N=78). To facilitate comparison of chronic cognitive impairment rates with other studies reported in the literature, we calculated the rate of subjective and objective cognitive impairment by cluster and by domain in the chronic 6-month recovery stage (Supplementary Table 4). We compared impairment rates across clusters using chi-square analyses. Results show that cluster 1 (dysexecutive neurophenotype) no longer differs in rates of complex attention or cognitive flexibility impairment relative to the other 2 clusters at the 6-month recovery stage. The rate of verbal memory impairment in cluster 2 (memory-speed impaired neurophenotype) does not differ from the other clusters. However, cluster 2 does demonstrate higher rates of objective visual memory and psychomotor speed impairment at the 6-month recovery stage, as well as a higher rate of self-reported subjective memory impairment.
To examine within-subject change in objective performance and subjective symptoms over time, we used the non-parametric Friedman test of differences among repeated measures. Results for each cluster are provided in Supplementary Table 5. Within cluster 1 (dysexecutive neurophenotype), there was marginal improvement in complex attention (p=0.06) and significant improvement in cognitive flexibility (p=0.01) but no change in subjective symptom report. Cluster 2 (memory-speed impaired neurophenotype) had no significant changes in objective cognitive test scores but did self-report improved attention (p=0.03). Cluster 3 ("normal cognition" neurophenotype) significantly improved in the domains of verbal memory (p=0.01) and psychomotor speed (p=0.003) but had no change in subjective symptom inventory scores.
Comparison of functional outcomes between clusters at the 6-month chronic recovery stage revealed that cluster 2 (memory-speed impaired neurophenotype) had worse functional outcomes (MOS-36 scores) compared to the other two clusters (Table 2), with particularly strong effects for energy/fatigue, general health, and health change (i.e., decline in health relative to one year ago). There were no differences in PTSD PCL-17 scores between clusters.

Discussion
In the current study, we extracted three distinct neurophenotypes from multivariate neuropsychological data collected in adults recovering from SARS-CoV-2 infection. Risk factors and 6-month recovery outcomes were distinct across neurophenotypes, which provides preliminary validation of this approach. Several findings emerged that can potentially be used to guide evaluations of post-COVID patients and clinical trials of therapeutics designed to target the cognitive sequelae of long COVID.
First, most participants (69%) performed within normal limits on objective cognitive measures during the post-acute recovery stage. These participants were classified in the "normal cognition" cluster, although they did report mild inattention, fatigue, memory, and pain complaints. Such complaints are often sufficient to prompt evaluation in post-COVID care clinics (Graham et al., 2021), particularly if there is subjective experience of health change/decline. On average, this neurophenotype showed improvement in memory and psychomotor speed over time, although this may have been at least partially due to practice effects. Membership in this group predicted normal functional outcomes 6 months after SARS-CoV-2 infection, which is a point that can be used to counsel patients with mild post-COVID neuropsychiatric complaints who perform normally on objective cognitive testing.
Second, we found a rate of cognitive impairment (31%) among our participants that is consistent with that reported in the literature ( that are part of the suspected neural-mucosal CNS entry route (Meinhardt et al., 2021) and are proximal to regional atrophy patterns implicated by neuroimaging of living patients, such as the piriform cortex, parahippocampal gyrus, and orbitofrontal cortex (Dondaine et al., 2022; Douaud et al., 2022), all of which are known to support memory and neuropsychiatric functions. An increasing number of studies also establish the inflammatory consequences of COVID-19 within the central nervous system (Vora et al., 2021). Biofluid biomarkers of astroglial activation (YKL-40) and pro-inflammatory cytokines (e.g., IL-1β, IL-6, IL-8, and TNF-α) distinguish cases from healthy uninfected controls (Pilotto et al., 2021), while markers of neuroaxonal loss (e.g., neurofilament light, total-tau) rise in proportion with disease severity, with higher levels identifying patients with worse outcomes at hospital discharge (Virhammar et al., 2021; Prudencio et al., 2021). Collectively, these findings suggest that post-COVID cognitive sequelae in the memory-speed impaired cluster may arise from the combined direct and indirect effects of COVID-19 infection on the brain. Surprisingly, younger individuals had a higher risk of membership in the memory-speed impairment cluster. This has two important implications. One is that the memory impairment in this group is unlikely to reflect unmasking of an incipient age-related neurodegenerative disease. The second is that these are individuals who would otherwise be working and raising families; thus, persistent cognitive impairment in this cohort is likely to result in greater functional impairment, raising per capita and indirect costs of disability, similar to what has been documented in conventional brain injury groups (Lo et al., 2021). For these young patients, early and intensive cognitive rehabilitation efforts are essential, not just for recovery and community integration, but for minimizing the financial impact of COVID-19 infection.
The dysexecutive neurophenotype was characterized by impairment in complex attention and cognitive flexibility. This was a milder neurophenotype that showed recovery over six months in complex attention and cognitive flexibility. The base rates for impairment in complex attention dropped from 36% to 9.1% and for cognitive flexibility from 52% to 9.1%. However, attrition may have inflated improvements, as those who completed 6-month follow-up had higher baseline complex attention and cognitive flexibility than those who were lost to follow-up. Risk factors for cluster membership included COVID-nonspecific factors such as neighborhood deprivation and obesity. Participants from communities with higher ADI scores are more likely to experience systemic disadvantage, potentially manifesting as reduced access to physical and mental healthcare, food insecurity, reduced exercise opportunities, more air pollution and unsafe housing, social discrimination, and increased worry about pandemic-related factors ( (Wang and Beydoun, 2007), which suggests that these may not be independent risk factors.

Treatment Considerations
Our findings emphasize differences and similarities across patients with long COVID symptoms. Post-acute neuropsychological profiles clustered into three distinct neurophenotypes, each associated with distinct risk factors and 6-month recovery outcomes. These findings can inform phenotype-specific approaches to treatment, highlighting the need for different treatment approaches rather than a "one size fits all" response to post-COVID symptoms. This is important for prudent programmatic resource allocation and financial effect modeling within medical provider teams and for minimizing out-of-pocket expenses incurred by patients. Importantly, we found that more than two-thirds of patients ascertained from a hospital registry do not have objective cognitive impairment. For many, inefficiencies in attention and executive functioning resolved within six months of infection. For the normal and dysexecutive neurophenotypes, reassurance and lifestyle counseling will likely be essential to improve long-term wellness, along with public and private health initiatives to strengthen pandemic childcare policies, employee sick time policies, and healthcare access. Cognitive Behavioral Therapy (CBT) is also likely to provide benefits for those reporting persistent anxiety, depression, insomnia, and fatigue (Adamson et al., 2020).
For the memory-speed neurophenotype, a comprehensive interdisciplinary rehabilitation approach that includes cognitive rehabilitation specialists can inform accommodations to support a successful return to work, school, or community reintegration.

Limitations
A significant study limitation was high participant attrition. Although there were no significant differences in follow-up rates by cluster, the mild and severe neurophenotypes had smaller sample sizes than the normal cluster. Disproportionate cluster size was not predicted in advance due to the unknown nature of the disease. Results provide valuable information for prospective study planning. Larger cohorts will be necessary to obtain sufficient sample size for the dysexecutive and memory-speed impaired neurophenotypes in future longitudinal outcome investigations. An additional limitation is that we did not evaluate whether participants received formal interventions or therapeutics between post-acute and chronic assessments; therefore, we cannot attribute recovery to the "natural course" of the disease.

Conclusion
The neurologic manifestations of long COVID present as distinct neurophenotypes with different risk factors and recovery trajectories. Future efforts should seek to replicate these neurophenotypes and their associated features in independent samples. It will be important to directly test whether the efficacy of various post-COVID therapeutics differs across neurophenotypes, given the high likelihood that different etiological factors contribute to post-COVID cognitive sequelae and influence recovery.

Declarations

Figure 1. Violin plots show cluster centers for each cognitive domain score from the CNS Vital Signs computerized test battery. Scores were age-adjusted based on a normative reference sample with a mean of 100 and standard deviation of 15. Cluster 1 shows impaired cluster centers in complex attention and cognitive flexibility (dysexecutive group). Cluster 2 shows impaired cluster centers in verbal memory, psychomotor speed, and reaction time (memory-speed impaired group). Cluster 3 shows average/normal cluster centers for all cognitive domains (normal group).