Ancestry-attenuated effects of socioeconomic deprivation on type 2 diabetes disparities in the All of Us cohort

Background: Diabetes is a common disease with a major burden on morbidity, mortality, and productivity. Type 2 diabetes (T2D) accounts for roughly 90% of all diabetes cases in the United States and has greater observed prevalence among those who identify as Black or Hispanic. Methods: The aims of this study were to determine whether T2D racial and ethnic disparities can be observed in data from the All of Us Research Program and to measure associations of genetic ancestry (GA) and socioeconomic deprivation with T2D. The All of Us Researcher Workbench was used to calculate T2D prevalence and to model T2D associations with GA and with individual-level (iSDI) and zip code-based (zSDI) socioeconomic deprivation indices within and between participant self-identified race and ethnicity (SIRE) groups. Results: The study cohort consisted of 86,488 participants from the four largest SIRE groups in All of Us: Asian (n=2,311), Black (n=16,282), Hispanic (n=16,966), and White (n=50,292). SIRE groups show characteristic genetic ancestry patterns, consistent with their diverse origins, together with a continuum of ancestry fractions within and between groups. The Black and Hispanic groups show the highest median SDI values, followed by the Asian and White groups. Black participants show the highest age- and sex-adjusted T2D prevalence (21.9%), followed by the Hispanic (19.9%), Asian (15.1%), and White (14.8%) groups. Minority SIRE groups and socioeconomic deprivation are positively associated with T2D when the entire cohort is analyzed together. However, SIRE and GA both show negative interaction effects with SDI on T2D. Higher levels of SDI are negatively associated with T2D in the Black and Hispanic groups, and higher levels of SDI are negatively associated with T2D at high levels of African and Native American ancestry.
Conclusion: Socioeconomic deprivation is positively associated with the SIRE group T2D disparities observed here but negatively associated with T2D within the Black and Hispanic groups that show the highest T2D prevalence. These results are paradoxical and have not been reported elsewhere. We discuss possible explanations for this paradox related to the nature of the All of Us data along with SIRE group differences in access to healthcare, diet, and lifestyle.


Introduction
Diabetes is a pervasive and costly disease in the United States and beyond, affecting over 34.2 million Americans (1) and costing billions of dollars annually in both healthcare expenses and reduced productivity (2). Of those afflicted with diabetes in the United States (US), type 2 diabetes (T2D) accounts for roughly 90% of cases (3). T2D disproportionately affects minority racial and ethnic groups in the US, with greater prevalence rates being observed among Hispanic, non-Hispanic Asian, and non-Hispanic Black individuals than among non-Hispanic White individuals (1,4,5). In addition to minority race and ethnicity, it has been demonstrated that socioeconomic deprivation increases one's risk of developing T2D (6,7). T2D risk has also been shown to vary with genetic ancestry, with European ancestry being associated with lower odds of developing T2D and African and Native American ancestry being associated with greater odds of developing T2D (8-11).
The causes of T2D are complex and span a multitude of genetic, demographic, social, and environmental factors that may modulate one another (12,13). As such, insight into the epidemiology of T2D may be gleaned from observing how these factors act in concert to affect disease risk. One promising source of data from which such insight may be obtained is the US National Institutes of Health (NIH) All of Us Research Program (abbreviated as All of Us hereafter).
All of Us was launched in 2015 with the mission of advancing individualized healthcare and health equity through the collection of genetic and health data from thousands of individuals across the US (14). The program places a special emphasis on recruiting participants from populations that have been historically underrepresented in health research. The resulting diverse participant cohort provides rich opportunities to investigate how self-identified race, ethnicity, and genetic ancestry, as well as their respective interactions with socioenvironmental factors, influence T2D risk in the US.
The first aim of this study was to determine whether there is evidence for T2D racial and ethnic health disparities in the All of Us cohort. The second aim was to assess whether and how socioeconomic deprivation, genetic ancestry, and their respective interactions contribute to T2D disparities and whether these contributions are consistent with what is known about how genetic and environmental factors influence T2D risk. We previously observed synergistic effects between socioeconomic deprivation and genetic ancestry in increasing T2D risk among ethnic minorities using data from the United Kingdom Biobank (15,16). The results of this study may reveal T2D disparity risk factors that could be potential targets of policy measures that aim to ameliorate T2D disease burden in the US. This study could also provide insight into the potential of All of Us as a resource for investigating epidemiological questions and health disparities.

Study Cohort
The cohort for this study was assembled from participant data made available through the All of Us Researcher Workbench, a cloud-based platform through which registered researchers can access and analyze All of Us participant data. All of Us volunteer participants were enrolled electronically or through healthcare provider organizations. The participant cohort consists of adults aged 18 years and older who reside in the US or in a US territory. Those who were either incarcerated or lacked the capacity to consent at the time of enrollment were excluded from the program. The All of Us operational protocol (#2016-05) was approved by the NIH Institutional Review Board.
Participant data consist of three datasets corresponding to different access tiers. The public tier dataset exclusively contains aggregate data and is freely accessible to all. The registered tier dataset consists of de-identified, individual-level data and is restricted to approved researchers. The controlled tier dataset consists of individual-level genomic data and expanded electronic health record (EHR) information. The study cohort was built using the All of Us Registered Tier Dataset v6 (curated version R2022Q2R2) and the All of Us Controlled Tier Dataset v6 (curated version C2022Q2R2). These datasets consist of data collected from participants who enrolled from 2018-2021, with a data cutoff date of 1/1/2022 and a data release date of 6/22/2022. Participant EHR, demographic, and socioeconomic data were obtained from the Registered Tier dataset. Participant genomic data were obtained from the Controlled Tier dataset. Demographic data consisted of self-identified race and ethnicity (SIRE), date of birth, and sex at birth. International Classification of Diseases codes (ICD-9-CM and ICD-10-CM) were extracted from participant EHR data and mapped to phecode 250.2 to classify individuals as T2D cases and controls (17).
Following the phecode convention, patients who exhibited phenotypes corresponding to phecodes 249-250.99 were excluded from the study cohort, as they bore conditions too similar to T2D to be reliably assigned as controls.
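The case, control, and exclusion logic described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the code used in the study: the function name classify_t2d_status is hypothetical, the input is assumed to be a participant's phecodes already mapped from ICD codes, and treating sub-codes that begin with 250.2 as cases is an assumption that the text does not confirm.

```python
def classify_t2d_status(phecodes):
    """Return 'case', 'excluded', or 'control' for one participant.

    phecodes: iterable of phecode strings (e.g. "250.2", "401.1") that
    have already been mapped from ICD-9-CM / ICD-10-CM codes.
    """
    codes = {str(code).strip() for code in phecodes}

    # Case: phecode 250.2 is present. Assumption: sub-codes such as
    # "250.21" are rolled up into 250.2 and also count as cases.
    if any(code.startswith("250.2") for code in codes):
        return "case"

    # Exclusion: any other phecode in the 249-250.99 range is too similar
    # to T2D for the participant to serve as a reliable control.
    for code in codes:
        try:
            value = float(code)
        except ValueError:
            continue  # ignore anything that is not a numeric phecode
        if 249.0 <= value <= 250.99:
            return "excluded"

    return "control"


if __name__ == "__main__":
    print(classify_t2d_status(["250.2", "401.1"]))  # T2D phecode present
    print(classify_t2d_status(["250.1"]))           # in the exclusion range
    print(classify_t2d_status(["401.1"]))           # unrelated phecode
```

Checking for case status before the exclusion range matters here, because 250.2 itself falls inside 249-250.99.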

Race and Ethnicity
Following enrollment, All of Us participants responded to a number of surveys spanning topics concerning their background, lifestyle, and health. In a core survey titled "The Basics", participants were asked to select one or more of seven main racial or ethnic categories that best describe them: (1) American Indian or Alaska Native, (2) Asian, (3) Black, (4) Hispanic or Latino, (5) Middle Eastern or North African, (6) Native Hawaiian or Pacific Islander, and (7) White. Additionally, participants could respond with "None of these fully describe me" or "Prefer not to answer". The All of Us Researcher Workbench provides these data as self-identified race and ethnicity categories, following the US Office of Management and Budget Standards. This information is currently unavailable for individuals who self-identified as American Indian or Alaska Native. Our study cohort consists of the four largest SIRE categories for All of Us participants: Asian, Black, Hispanic, and White. We defined non-Hispanic Asian, Black, and White participants as those who selected these respective racial categories in the "The Basics" core survey and no other racial or ethnic category. We defined Hispanic participants as all who selected "Hispanic or Latino".

Quantifying Socioeconomic Deprivation
To quantify socioeconomic deprivation, we used a zip code-based socioeconomic deprivation index (zSDI) devised by Brokamp et al. (18). zSDI values are available for most participants in the All of Us Researcher Workbench. Further motivating our decision to use zSDI as a metric for deprivation is the demonstrated role of geographic location in driving and perpetuating health disparities (19). The zSDI is based on six different socioeconomic variables included in the 2015 American Community Survey. The values of these variables are defined for census tracts and include (1) the proportion of the population whose income in the past 12 months places them below the poverty threshold, (2) the median household income of the population, (3) the proportion of the population who are at least 25 years of age who have at least a high school level of education, (4) the proportion of the population with no health insurance coverage, (5) the proportion of households that received public assistance income or food stamps within the past 12 months, and (6) the proportion of households that are vacant. The zSDI represents the value of the first principal component resulting from a principal component analysis performed on these six measures. The zSDI ranges from 0 to 1, with a higher value corresponding to higher levels of socioeconomic deprivation.
All of Us Researcher Workbench survey questions related to socioeconomic status were used to create an individual-level socioeconomic deprivation index (iSDI) comparable to the area-level deprivation index created by Brokamp et al. The following questions from All of Us's "The Basics" survey were used to calculate iSDI: (1) "What is your annual household income from all sources?", (2) "Do you own or rent the place where you live?", (3) "Are you covered by health insurance or some other kind of health care plan?", (4) "What is your current employment status?", and (5) "What is the highest grade or year of school you completed?". We coded participant responses to these questions as ordinal variables and performed dimensionality reduction using principal component analysis of the ordinal values to compute iSDI for individual All of Us participants. Like the zSDI area-level deprivation index created by Brokamp et al., this index ranges from 0 to 1, with greater values corresponding to higher levels of deprivation.
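To make the index construction concrete, the following Python sketch computes a first-principal-component score from ordinal survey codes and rescales it to the 0 to 1 range. It is a minimal illustration under several assumptions that the text does not specify: the ordinal coding (here, larger values mean more deprivation), the standardization of each variable before the analysis, the min-max rescaling, and the function name first_pc_index are all hypothetical choices rather than the authors' implementation.

```python
import math


def first_pc_index(rows, n_iter=500):
    """Score each row on the first principal component, rescaled to [0, 1].

    rows: list of equal-length lists of ordinal codes, one list per
    participant. Assumes every column varies (non-zero standard deviation)
    and that larger codes mean greater deprivation.
    """
    n, k = len(rows), len(rows[0])

    # Standardize each column to z-scores (an assumption of this sketch).
    means = [sum(r[j] for r in rows) / n for j in range(k)]
    sds = [
        math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
        for j in range(k)
    ]
    z = [[(r[j] - means[j]) / sds[j] for j in range(k)] for r in rows]

    # Covariance matrix of the standardized data (the correlation matrix).
    cov = [
        [sum(zr[a] * zr[b] for zr in z) / (n - 1) for b in range(k)]
        for a in range(k)
    ]

    # Power iteration converges to the leading eigenvector, i.e. the
    # loadings of the first principal component.
    v = [1.0 / math.sqrt(k)] * k
    for _ in range(n_iter):
        w = [sum(cov[a][b] * v[b] for b in range(k)) for a in range(k)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]

    # Orient the component so that higher codes give higher scores.
    if sum(v) < 0:
        v = [-x for x in v]

    scores = [sum(zr[j] * v[j] for j in range(k)) for zr in z]
    lo, hi = min(scores), max(scores)
    return [(s - lo) / (hi - lo) for s in scores]


if __name__ == "__main__":
    # Four hypothetical participants, three hypothetical ordinal items.
    toy = [[0, 0, 0], [1, 1, 0], [2, 1, 1], [3, 2, 2]]
    print(first_pc_index(toy))
```

The sign of a principal component is arbitrary, so the explicit orientation step is what makes larger values correspond to greater deprivation in this sketch.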

Genetic Ancestry
Genome-wide genotype data was made available for 165,080 participants through the All of Us Controlled Tier Dataset v6. Genotype variants were called for 1,824,517 genomic positions on the GRCh38/hg38 reference genome build using the Illumina Global Diversity Array. We harmonized All of Us participant genotype variants with whole genome sequence variant data from global reference populations characterized as part of the 1000 Genomes Project and the Human Genome Diversity Project (20,21). Biallelic variants that were shared across the All of Us and the global reference population variant sets were merged, ensuring consistency between reference and alternate allele designations.
The Rye (Rapid ancestrY Estimation) program was used to infer participant genetic ancestry from the final variant dataset (23). Rye performs genetic ancestry inference by utilizing principal component analyses of genomic variant data. Principal component analysis was performed on the harmonized variant dataset using the FastPCA program implemented in PLINK version 2.0 (24,25). The resulting data was used to define reference samples representing six continental ancestry groups: African, Asian, European, Native American, Oceanian, and West Asian. Rye was run on the first 25 principal components of the data, assigning ancestry group fractions to each participant.

Statistical Analysis
Due to the overrepresentation of older and female participants in the study cohort, SIRE-specific T2D prevalence estimates were adjusted for age and sex. For every SIRE group, unadjusted T2D prevalence, p, was taken to be K cases over n total individuals belonging to the SIRE group. Age/sex-adjustment was performed by weighting the unadjusted prevalence for groups of participants corresponding to different age-sex combinations using census fractions, f, calculated from American Community Survey 1-year estimates. Census fractions, in this case, are the proportion of the total US population of a SIRE group that falls into a particular age-sex category. Across the different age-sex categories, c, adjusted prevalence, p_adj, was calculated like so:
$$p_{adj} = \sum_{c} f_c \, \frac{K_c}{n_c}$$
95% confidence intervals for each adjusted SIRE-specific prevalence estimate were calculated by adding and subtracting the product of each adjusted estimate's standard error, SE, and 1.96 to and from the adjusted prevalence estimate:
$$CI_{95\%} = p_{adj} \pm 1.96 \times SE$$
All T2D model analyses were performed in R version 4.2.2 using the stats package (26). Multivariable logistic regression was used to investigate associations of risk factors such as SIRE, genetic ancestry, and SDI with T2D case status (case = 1, control = 0). Participant age and sex were included as covariates in all models. Regression models were constructed using the glm function in R. To assess the effects of geographical clustering on our results, multilevel models were constructed using the glmer function from the lme4 package in R. Interaction effects were also estimated with the glm function. These effects were visualized using the plot_model function from the sjPlot R package (27). Specifications for all models can be found in the titles for their corresponding tables (Tables 2-4).
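The age- and sex-adjustment of prevalence described in this section is a form of direct standardization, which can be sketched in Python as follows. The sketch makes two labeled assumptions: the census fractions are renormalized to sum to one, and the standard error is computed from the binomial variance within each age-sex category, a common choice that the text does not confirm. The function name and the example counts are hypothetical.

```python
import math


def adjusted_prevalence(strata, z=1.96):
    """Directly standardized prevalence with an approximate 95% CI.

    strata: list of (cases, n, census_fraction) tuples, one per age-sex
    category c, where cases = K_c, n = n_c, and census_fraction = f_c.
    Returns (p_adj, lower, upper).
    """
    total_f = sum(f for _, _, f in strata)

    p_adj = 0.0
    variance = 0.0
    for cases, n, f in strata:
        weight = f / total_f      # renormalize weights to sum to 1
        p_c = cases / n           # unadjusted prevalence in category c
        p_adj += weight * p_c
        # Assumption: binomial variance of p_c, weighted by f_c squared.
        variance += weight ** 2 * p_c * (1.0 - p_c) / n

    se = math.sqrt(variance)
    return p_adj, p_adj - z * se, p_adj + z * se


if __name__ == "__main__":
    # Hypothetical counts for two age-sex categories of one SIRE group.
    example = [(10, 100, 0.5), (30, 100, 0.5)]
    p_adj, lower, upper = adjusted_prevalence(example)
    print(f"{p_adj:.3f} ({lower:.3f}, {upper:.3f})")
```

Weighting by census fractions rather than by the cohort's own age-sex composition is what removes the effect of the overrepresentation of older and female participants.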

Results
The cohort for this study was selected from All of Us participants who have EHR data available and who fell outside the exclusion range for T2D, as defined by the phecode exclusion scheme (Supplementary Figure 1). Individuals in the cohort were restricted to those whose survey responses designated them to one of four SIRE groups (Asian, Black, Hispanic, and White), which represent the four largest racial or ethnic categories in the United States. To better understand the roles that genetic ancestry, socioeconomic deprivation, and sex play in T2D risk, the study cohort was further restricted to individuals for whom genomic and socioeconomic data were available and whose sex at birth was either male or female. Our final cohort consisted of 86,488 individuals whose mean age was 54.3 and of whom 64.78% were female (Table 1).
All of Us participant genomic variant data were analyzed together with variant data from global reference populations to infer participants' genetic ancestry fractions for six continental ancestry groups: African, Asian, European, Native American, Oceanian, and West Asian (Supplementary Table 1 and Supplementary Figure 2). Participant genetic ancestry fractions, stratified by SIRE groups, are shown in Figure 1A. SIRE groups show characteristic ancestry patterns, together with continua of ancestry fractions within and between groups. Those who self-identified as belonging to the Asian SIRE group were predominantly of Asian ancestry, the Black SIRE group of African ancestry, and the White SIRE group of European ancestry (Figure 1B). Some exceptions to this pattern can be noted, however, as in the case of certain individuals belonging to the White SIRE group who were mostly of West Asian ancestry, and in the case of certain individuals belonging to the Black and Asian SIRE groups who were mostly of European ancestry. In comparison to the Asian, Black, and White SIRE groups, individuals belonging to the Hispanic SIRE group demonstrated great heterogeneity in their ancestry fractions, with substantial European, Native American, and African components. This is consistent with the fact that participants with Hispanic ethnicity can identify with any race, following the current OMB standards.
All of Us participant socioeconomic deprivation was measured using a composite, place-based index (zSDI) that includes information on income, education, housing, and public assistance. We observe a clear disparity in zSDI across the four SIRE groups, with those self-identifying as Black and Hispanic exhibiting the highest median zSDI (0.35), followed by the Asian (0.31) and White (0.30) groups (Figure 1C; ANOVA, F = 4793, p < 2e-16).
Participant T2D diagnoses gleaned from EHR were used to calculate prevalence values for each of the four SIRE groups (Figure 2A). Of these four groups, the Black SIRE group demonstrated the highest adjusted prevalence percentage (21.87%, CI: 0.60) with the Hispanic SIRE group following closely behind (19.92%, CI: 0.58). The Asian (15.14%, CI: 1.37) and White (14.80%, CI: 0.32) SIRE groups exhibit the lowest adjusted prevalence percentages. The relative T2D prevalence values among SIRE groups are similar to what is seen when different methods are used to create the cohort from All of Us data and resemble the pattern of T2D disparities reported by the Centers for Disease Control and Prevention (CDC; Supplementary Table 2).
To further investigate the association between T2D risk and SIRE, we modeled T2D case/control status as a function of SIRE, using age and sex as covariates (Figure 2B). In this model, the White SIRE group was used as a reference group. The results of this model revealed that belonging to the Hispanic as opposed to belonging to the White SIRE group conferred the greatest increase in the odds of T2D (OR: 2.46, CI: 2.35-2.58), followed by belonging to the Black SIRE group (OR: 2.42, CI: 2.32-2.53) and followed last by belonging to the Asian SIRE group (OR: 1.31, CI: 1.16-1.48). Additionally, increasing age (OR: 1.04, CI: 1.04-1.05) is associated with greater predicted T2D risk and being female (OR: 0.81, CI: 0.78-0.84) is associated with lower predicted T2D risk.
A similar analysis was performed to elucidate the association between T2D risk and genetic ancestry. In this analysis, we modeled T2D case/control status as a function of a particular genetic ancestry fraction, using age and sex as covariates (Table 2). A model with these specifications was generated for four genetic ancestry groups that our four SIRE groups of interest are closely associated with: Asian ancestry (Asian SIRE), African (Black SIRE), European (White SIRE), and Native American (Hispanic SIRE). African ancestry has the highest positive coefficient (0.21), suggesting that there is an increased level of T2D risk among those with greater African ancestry fractions. Native American ancestry has the second-highest coefficient (0.14). Asian (-0.10) and European (-0.19) have negative coefficients, suggesting that there is lower T2D risk among those with a greater proportion of these ancestry fractions. All of these coefficients are significant at an alpha of 0.05. These patterns largely remained and were amplified when SDI was controlled for (Supplementary Table 3).
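Because these are logistic regression coefficients, they are on the log-odds scale, and exponentiating them gives odds ratios like those reported for the SIRE model. The short Python sketch below applies that conversion to the four ancestry coefficients quoted above. The helper name odds_ratio is hypothetical, the optional standard error argument is included only for illustration because standard errors are not given in the text, and each odds ratio is per one unit of the ancestry predictor as it was scaled in the model, which the text does not specify.

```python
import math


def odds_ratio(beta, se=None, z=1.96):
    """Convert a logistic regression coefficient to an odds ratio.

    beta: coefficient on the log-odds scale.
    se:   optional standard error of beta (hypothetical input here).
    Returns (OR, None) or (OR, (lower, upper)) when se is supplied.
    """
    point = math.exp(beta)
    if se is None:
        return point, None
    return point, (math.exp(beta - z * se), math.exp(beta + z * se))


if __name__ == "__main__":
    # Coefficients quoted in the text for the ancestry models.
    coefficients = {
        "African": 0.21,
        "Native American": 0.14,
        "Asian": -0.10,
        "European": -0.19,
    }
    for ancestry, beta in coefficients.items():
        point, _ = odds_ratio(beta)
        print(f"{ancestry}: OR = {point:.2f}")
```

A positive coefficient always maps to an odds ratio above 1 and a negative coefficient to an odds ratio below 1, which is why the sign alone is enough to read the direction of association.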
Additional models were created to investigate the association between T2D risk and socioeconomic deprivation, which modeled T2D case/control status as a function of area-based zSDI and individual-level iSDI, using age and sex as covariates. Participant area-based (zSDI) and individual-level (iSDI) socioeconomic deprivation are significantly correlated (r = 0.26, p < 2.2e-16). As would be expected, the models returned high, positive coefficients for zSDI (2.52) and iSDI (1.99), indicating greater odds of T2D at greater levels of both area-based and individual-level socioeconomic deprivation. Multilevel modeling with iSDI as a fixed effect and zSDI as the random effect returned a high, positive coefficient for iSDI (1.98), suggesting that individual-level socioeconomic deprivation remains tightly associated with T2D risk when controlling for zip code clustering (Table 2).
Additional multivariable logistic regression models were created to investigate how SIRE, genetic ancestry, and SDI interact to modify predicted T2D risk. As part of this analysis, we modeled T2D case/control status as a function of either the SIRE-zSDI or GA-zSDI interaction terms, using age and sex as covariates. The SIRE-zSDI models returned significant and negative interaction coefficients for the Black-zSDI (-1.67) and Hispanic-zSDI (-1.40) interaction terms, suggesting that greater socioeconomic deprivation is associated with reduced risk of T2D for individuals belonging to either the Black or Hispanic SIRE groups (Table 3 and Figure 3A). However, when restricting the cohort to native-born participants, the Hispanic-zSDI interaction term is no longer significant (Supplementary Table 4). Relative excess risk due to interaction (RERI) values for the Black (-4.02) and Hispanic (-3.66) groups are also negative, indicating subadditive effects of SIRE and zSDI. The opposite trend is observed for individuals belonging to the Asian and White SIRE groups, in which greater socioeconomic deprivation is associated with a greater risk of T2D. The negative interactions observed between Black and Hispanic SIRE and socioeconomic deprivation can also be seen when individual-level iSDI is used to model T2D outcomes (Supplementary Table 5).
Similarly significant and negative interaction coefficients were returned by the GA-zSDI models, specifically for the African-zSDI (-3.59), Asian-zSDI (-2.90), and Native American-zSDI (-4.84) interaction terms (Table 3). In contrast, a significant positive interaction coefficient was reported for the European-zSDI (1.34) interaction term. RERI values show the same trends, with negative values for African-zSDI (-7.76), Asian-zSDI (-6.23), and Native American-zSDI (-10.48), compared to positive RERI for European-zSDI (2.18). Visualization of the GA-zSDI interactions shows that high zSDI is a risk factor for T2D at low levels of non-European ancestry and this trend switches at high levels of non-European ancestry, where low zSDI has higher predicted risk (Figure 3B-E). This pattern is particularly pronounced for African and Native American ancestry. The pattern of all of these interaction effects remains when the cohort is stratified by sex (Supplementary Tables 6 and 7; Supplementary Figures 3 and 4).
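For readers unfamiliar with RERI, the sketch below shows the standard way it is derived from a logistic model with two exposures and their product term, using odds ratios as approximations of relative risks. The text does not state exactly how the reported RERI values were computed, so this is an illustration of the general formula rather than the authors' procedure; the function name and the example coefficients are hypothetical and are not taken from the tables.

```python
import math


def reri(beta_a, beta_b, beta_interaction):
    """Relative excess risk due to interaction from logistic coefficients.

    beta_a:           coefficient for exposure A alone (log-odds scale)
    beta_b:           coefficient for exposure B alone (log-odds scale)
    beta_interaction: coefficient for the A x B product term

    RERI = OR_11 - OR_10 - OR_01 + 1, where OR_11 is the odds ratio for
    joint exposure relative to the doubly unexposed reference.
    """
    or_10 = math.exp(beta_a)
    or_01 = math.exp(beta_b)
    or_11 = math.exp(beta_a + beta_b + beta_interaction)
    return or_11 - or_10 - or_01 + 1.0


if __name__ == "__main__":
    # Hypothetical coefficients: two harmful exposures whose product
    # term is negative, giving a subadditive joint effect.
    value = reri(math.log(2.0), math.log(3.0), math.log(0.5))
    print(f"RERI = {value:.2f}")
```

A RERI of zero means the joint effect is exactly additive, a negative value means the combined exposure carries less excess risk than the sum of the separate exposures (subadditive), and a positive value means more (superadditive).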
These results were validated with another set of models which modeled T2D risk as a function of zSDI, using age and sex as covariates (Table 4). These models were each run on a different subset of the study cohort consisting exclusively of individuals from one of the four SIRE groups under investigation. The models corresponding to the Black and Hispanic SIRE cohorts returned negative coefficients for zSDI (-0.70 and -0.41, respectively). The model corresponding to the White SIRE cohort returned a significant and positive coefficient for zSDI (0.77), consistent with the coefficients observed for the SIRE-zSDI interaction terms.

Discussion
The patterns observed in the SIRE group-specific T2D prevalence estimates align closely with patterns observed in other calculations of T2D prevalence (Supplementary Table 1). Across the different methods we used to calculate prevalence estimates using All of Us data, T2D prevalence was consistently highest among individuals identifying as Black and second highest among individuals identifying as Hispanic. Across almost all the methods we have used to calculate T2D prevalence, prevalence was lowest among individuals identifying as White and second lowest among individuals identifying as Asian. These disparities are consistent with group-specific T2D prevalence estimates shown in the CDC National Diabetes Statistics Report, supporting the use of All of Us to study T2D disparities.
This study validates many previous investigations into the association between factors such as genetic ancestry, SIRE, and T2D risk. Past observations have shown that populations that are more socioeconomically deprived and consist of more individuals identifying as Black or Hispanic (or having African or Native American ancestry) suffer from a greater T2D disease burden than their White and European ancestry counterparts (7-11). The analyses that we performed on All of Us data reveal similar associations, underscoring the need for policy measures to alleviate the T2D disease burden of minority and socioeconomically deprived communities. What has been less thoroughly documented, however, is how interactions between genetic ancestry and socioeconomic deprivation are associated with T2D prevalence.
We previously explored interactions between socioeconomic deprivation and genetic ancestry using data obtained from the UK Biobank, a large-scale biomedical research resource that consolidates genetic and health data on roughly 500,000 participants from across the United Kingdom (15,16). These analyses revealed positive interaction effects between socioeconomic deprivation and ancestry, in contrast to the interaction effects reported here for the All of Us cohort, which returned significant negative interaction coefficients for all GA-zSDI interaction terms except European-zSDI. Negative interaction coefficients were also returned for all SIRE-zSDI interaction terms in this study. The negative Hispanic-zSDI term, however, is no longer significant when interaction analyses are run on a subset of the study cohort that exclusively consists of those native to the US. This may be a result of the healthy immigrant paradox, a phenomenon in which immigrants have positive health outcomes relative to their socioeconomic status.
This phenomenon was first reported among Hispanic immigrants living in the Southwestern United States (28). The negative interactions between SIRE and zSDI are confirmed by zSDI effect size differences within and between groups. When the entire cohort is modeled together, zSDI is positively associated with T2D (Table 2), but zSDI is negatively associated with T2D for Black and Hispanic groups (Table 4).
That T2D risk may decrease with greater socioeconomic deprivation within minority groups contradicts prevailing knowledge concerning socioeconomic status and disease burden. Barriers to healthcare access faced by members of minority groups may provide one possible explanation for these paradoxical results. Minority racial groups in the United States tend to face greater obstacles when pursuing medical care, as such groups have lower levels of health insurance coverage and reduced quality of care, among other challenges (29). Presumably, these barriers are exacerbated by socioeconomic deprivation, so it may be the case that the most socioeconomically deprived members of minority groups are less likely to receive the level of medical care needed to receive a T2D diagnosis. Since our analysis relies on T2D diagnoses recorded in EHR, this could lower the observed T2D prevalence for the most socioeconomically deprived members of minority groups. If this conjecture is accurate, it would suggest that interventions are needed to expand healthcare access and encourage healthcare participation in socioeconomically deprived minority communities. One additional explanation for the paradoxical trends observed here may be that the amounts of discrimination that racial minority groups are exposed to are greater at higher levels of socioeconomic status (30). A previous study assessed interaction effects between race and education on the release of C-reactive protein (CRP), a biomarker of inflammation and general stress. The results of the study revealed that more highly educated individuals who identify as Black had elevated levels of CRP compared to their less-educated counterparts. Such differences were not as pronounced among individuals who identify as White. These findings suggest that due to higher levels of discrimination among those of greater socioeconomic status, better socioeconomic status may not strongly benefit the health of racial minority groups.
It could also be possible that, paradoxically, socioeconomic deprivation is a protective factor against T2D within minority racial groups. We have previously observed this phenomenon for an African ancestry population in Colombia, where those facing the most extreme forms of poverty have subsistence diet and lifestyle factors that contribute to a lower risk of T2D (11). Similar occurrences may be contributing to lower T2D prevalence among the most deprived in the All of Us cohort. Those on the extreme ends of socioeconomic deprivation may also suffer from the most severe forms of food insecurity, which could contribute to a lower risk of T2D. Furthermore, as many social programs such as Medicaid and SNAP impose income cutoffs, the most deprived individuals may have access to medical and nutritional resources that moderately deprived individuals lack. A definitive resolution to this paradox will require further investigation.
There are several potential limitations to this study. As this is an observational study, a number of unobserved confounding variables may be present. Factors concerning lifestyle and diet, which are known to influence T2D risk, for instance, may covary with the three variables under investigation: SDI, genetic ancestry, and SIRE. Furthermore, population biobanks are prone to volunteer bias, whereby older, healthier, and less disadvantaged individuals tend to participate. This has been seen for the UK Biobank, but the extent of volunteer bias in All of Us is currently unknown (31). Finally, the definition of a T2D case used here relies on diagnosis codes from EHR, which could lead to imprecise phenotyping, with potential differences across SIRE groups and levels of socioeconomic deprivation.
In conclusion, these results provide evidence for the existence of T2D disparities in the All of Us participant cohort. While T2D burden differs broadly across racial, genetic, and socioeconomic lines in ways previously reported, interactions between these variables reveal an unexpected interplay between genetic ancestry and socioeconomic status. The paradoxical relationship between socioeconomic status and T2D risk observed in Black and Hispanic individuals, though potentially a result of limited data, may reflect broader societal issues such as discrimination and inadequate access to healthcare among racial and ethnic minority groups. Policy interventions and research aimed at reducing racial health disparities may benefit from taking such issues into account.

Declarations

Ethics approval and consent to participate
The All of Us operational protocol (#2016-05) is approved by the NIH Institutional Review Board. Written informed consent was obtained from all participants. All data available to researchers has had direct identifiers removed and has been further modified to minimize re-identification risks. Because the All of Us data were not collected specifically for this study and no one on our study team has access to the subject identifiers linked to the data, this study is not considered human subjects research according to the NIH Revised Common Rule for the Protection of Human Subjects: https://grants.nih.gov/policy/humansubjects/hs-decision.htm

Availability of data and materials
All of Us participant data can be accessed and analyzed from the Researcher Workbench by registered users: https://www.researchallofus.org/data-tools/workbench/

Figure caption: Interaction between race/ethnicity, genetic ancestry, and socioeconomic deprivation (zSDI) on T2D prevalence. T2D prevalence estimates, and 95% confidence intervals, are taken from multivariable ...

Abbreviations
CDC: Centers for Disease Control and Prevention, EHR: electronic health record, GA: genetic ancestry, LD: linkage disequilibrium, NIH: National Institutes of Health, iSDI: individual-level socioeconomic deprivation index, zSDI: zip code-based socioeconomic deprivation index, SIRE: self-identified race and ethnicity, T2D: type 2 diabetes, US: United States.

Table 2. T2D, genetic ancestry, and SDI. T2D was modeled by ancestry, zip code-based zSDI, and individual-level iSDI, controlling for age and sex. iSDI was modeled using single-level (iSDI-s) and multi-level (iSDI-m) models with iSDI-m as a fixed effect and zip code as the random effect.