Exploring genomic regions involved in bread wheat resistance to leaf rust at seedling/adult stages by using GWAS analysis

Background Global wheat productivity is seriously challenged by a range of rust pathogens, especially leaf rust derived from Puccinia triticina. Since the most efficient approach to control leaf rust is genetic resistance, many efforts have been made to uncover resistance genes; however, it demands an ongoing exploration for effective resistance sources because of the advent of novel virulent races. Thus, the current study was focused on detecting leaf rust resistance-related genomic loci against the P. triticina prevalent races by GWAS in a set of Iranian cultivars and landraces. Results Evaluation of 320 Iranian bread wheat cultivars and landraces against four prevalent rust pathotypes of P. triticina (LR-99–2, LR-98–12, LR-98–22, and LR-97–12) indicated the diversity in wheat accessions responses to P. triticina. From GWAS results, 80 leaf rust resistance QTLs were located in the surrounding known QTLs/genes on almost chromosomes, except for 1D, 3D, 4D, and 7D. Of these, six MTAs (rs20781/rs20782 associated with resistance to LR-97–12; rs49543/rs52026 for LR-98–22; rs44885/rs44886 for LR-98–22/LR-98–1/LR-99–2) were found on genomic regions where no resistance genes previously reported, suggesting new loci conferring resistance to leaf rust. The GBLUP genomic prediction model appeared better than RR-BLUP and BRR, reflecting that GBLUP is a potent model for genomic selection in wheat accessions. Conclusions Overall, the newly identified MTAs as well as the highly resistant accessions in the recent work provide an opportunity towards improving leaf rust resistance. Supplementary Information The online version contains supplementary material available at 10.1186/s12864-022-09096-1.

varieties is imperative to protect against yield loss due to leaf rust.
Albeit about eighty genes have been recorded for leaf rust resistance, only some have been successfully exploited in wheat breeding [6,7]. Most of these genes cause hypersensitive response upon infection due to qualitative race-specific resistance, also called seedling resistance (SR), where the host's resistance and the pathogen's avirulence genes cause incompatible interaction. Such resistance in crops is short-lived because of its failure via a novel virulent race [3]. In contrast, adult plant resistance (APR) genes confer quantitative resistance, which is long-lived, race-non-specific, and controlled by small impact genes, although there are some exceptions, e. g, Lr13 and Lr37 are race specific but APR [8]. Most resistance genes confer SR and are race-specific, except for a few genes, such as Lr68, Lr67, Lr46, and Lr34, which are race-non-specific and APR [3]. The durability and longevity of leaf rust resistance can be strengthened by a combination of both APR and SR genes [1]. Of course, the long-term use of cultivars with single major Lr genes along with the selection pressure on pathogens results in the manifestation of novel races [9]. This highlights the increasing requirement to search for novel resources of resistance genes towards enhancing resistance to emerging races [10]. To date, some researchers have indicated that potential resistance sources can be found in wild relatives, landraces. As a result, such diversities need to be harnessed for identifying new resistance to being used in wheat breeding programs [11].
Albeit plenty of QTLs for leaf rust resistance have been uncovered by linkage mapping, however, there are restrictions related to this mapping, such as low resolution, less diversity sampled, and the long time for establishing a bi-parental population [12,13]. As an alternative approach, Genome wide association mapping (GWAS) was found helpful for detecting genes/QTLs for wheat resistance to leaf rust. This approach is time-consuming for discovering marker-trait associations (MTAs) or genes/QTLs in natural populations. The uncovering of QTLs in such populations leads to high mapping resolution since they harness all historical recombination events [12]. Various characteristics in wheat, including agro-morphological traits [14], resistance to stripe rust [15], stem rust [16], powdery mildew [17], and fusarium head blight [18] have been dissected successfully by association mapping. There are some reports on mapping QTL/genes involved in bread wheat resistance to leaf rust [9,[19][20][21][22][23], with a potential to marker-assisted selection (MAS).
To apply GWAS, various statistical models, singlelocus (MLM) and multi-locus (mrMLM), have been adopted [24][25][26]. The mrMLM model accounts for large impact loci adequately, whereas the MLM model does not address this issue [25]. The multi-locus model is more reliable than MLM for mapping QTL/genes since all marker impacts are concomitantly estimated and do not need the test of MTAs using rigorous multiple corrections, which leads to the discarding of potential MTAs.
Iran is one of the countries in the Fertile Crescent region, which is known as the center origin and diversity of wheat. Additionally, previous studies have indicated that the center of origin of P. triticina is likely somewhere in the Fertile Crescent region in southwest Asia, where both sexual and asexual reproduction prevail [27]. Therefore, this region could provide an opportunity for natural selection and maintenance of resistance accessions. So, the purposes of this research were to detect genes/QTLs related to leaf rust resistance on a diverse panel of wheat cultivars and landraces originating from several geographical areas in Iran at the adult and seedling growth stages; and compare the resistance genes/QTLs discovered in the recent work with known genes/QTLs.

Phenotypic assessment
The distribution of phenotypic response of 320 Iranian bread wheat accessions to four leaf rust pathotypes (LR-99-2, LR-98-12, LR-98-22, and LR-97-12) is presented in Fig. 1, suggesting most wheat genotypes are susceptible. Wheat genotypes that are resistant to all four Puccinia triticina (Pt) isolates are presented in Table S7. For normality, the SW test indicated a normal distribution for the phenotypic data. Thus, homogeneity within experiments was evaluated by the Levine's test, which exhibited that the phenotypic variances are homogenous for all pathotypes (P = 0.35 to 0.92) ( Table S2). As a result, for each wheat accession, the entire mean value was adopted in the association mapping. The H 2 of the leaf rust ITs ranged from 0.98 to 0.99 for the pathotypes, reflecting most phenotypic diversity can be justified via genotypic variation (Table S2, and S3).
From Pearson's correlation analysis, there are highly significant correlation coefficients (r) among all pathotypes for adult plant and seedling growth stages (Fig. 2). Correlation outcomes indicate that all traits evaluated in the current study (i.e., evaluation parameters of rust infection) can be adopted for association mapping analysis.
The virulence/avirulence profile of the pathotypes's reactions revealed differences among pathotypes for virulence/avirulence genes (Table S4). Climatic and geographic information for collection locations of wheat leaf rust isolates are presented in Table S5. For pathotypes LR-99-2, LR-98-12, LR-98-22, and LR-97-12, the resistant ITs were recorded in 5, 9.7, 4.4, and 4.4% of the cultivars, respectively, and the resistant ITs in 2.5, 6.6, 7.2, and 6.2% of the landraces, respectively (Table S6). Pathotypes Lr-99-2 and Lr-97-12 were the most virulent on 27.8 and 65.6% of susceptible cultivars and landraces, respectively, while pathotypes Lr-98-12 was the lowest virulent on 22.19 and 60.63% of susceptible cultivars and landraces, respectively. Under field conditions, the adult plant response of the wheat panel was assessed by coefficient of infection (CI), final disease severity (FDS), and area under the disease progression curve (AUDPC) ( Table S8). The CI value, which is the product of FDS and AUDPC, was utilized for GWAS analysis. The CI of 20.3 and 23.7% of the wheat genotypes displayed resistance reaction to rust infection in the Ahwaz and Karaj locations, respectively. Furthermore, moderate and susceptible responses were found in 12.5% (Ahwaz) to 24.4% (Karaj) and 67.2% (Ahwaz) to 51.9% (Karaj) wheat genotypes, respectively. Thus, the presence of APR genes in the genetic background of the Iranian wheat accessions could provide resistance in the adult crops.

Genotypic assessment
From GBS results, a total of 566,439,207 unique reads were recorded with approximately 80% being non-redundant. After de-duplication and alignment, 133,039 SNPs could be called of which 10,938 had a MAF > 1%, H < 10%, and MR < 10% (ensuring quality checks). The ultimate data set consisted of 46,203 imputed SNPs, which were used in GWAS. The highest number of SNPs was detected in the B genome, particularly Chr.6B, Chr.2B, and Chr.3B (Fig. S1).

Linkage disequilibrium (LD)
The LD value varies both within and between subgenomes and chromosomes, often decreasing with increasing distance occurring between SNP locations (Table 1). A total of 1,830,925 marker pairs (MP) with r 2 = 0.21 were discovered in wheat cultivars, of which 38% had significant linkages at P < 0.001. The highest and lowest numbers of MPs were recorded in the B (51.8%) and D (11.2%) genomes, respectively. The highest LD was found between marker pairs located on the Chr4A, followed by Chr1D (Table 1).
Applying a similar exploration using the wheat landraces led to detecting 1,828,675 MPs with a mean r 2 of 0.18, which is lower than in wheat cultivars. A larger fraction of MPs, however, was found to be in significant LD (45.7%). The highest and lowest MPs were registered in the B (928,125) and D (233,075) genomes, respectively. The LD value was found strongest between MPs in the Chr4A, followed by the Chr2A (Table 1).  For estimating LD decay, the LD of 0.157 was determined as the cut-off. It was found that the LD value of the D genome was decayed in a faster manner, followed by the A genome in the wheat accessions panel. In the A genome, the mean LD for MPs was 0.16 at 2.4 Mb in contrast to 1.7 Mb in D chromosomes and 5.4 Mb in the B chromosomes (Fig. S2).

Population structure
The population structure analysis led to specifying three subpopulations with different levels of admixture (Fig. S3). The population structure matrix revealed the max value of ΔK for K = 3, demonstrating that the Iranian wheat accessions can be divided into three subpopulations.
The PCA was carried out, where PC1 and PC2 justified 16.9 and 6.3% of the diversity variance, respectively. The scatter plot of PCA indicated that the PC1 and PC2 could distinguish the three subpopulations of wheat accessions deriving from various wheat-cultivating areas (Fig. S3), which further supported the outcome of the Structure. Of course, there are some admixed accessions falling between the three subpopulations.

Marker Trait Association (MTAs)
Two GWAS models, MLM and mrMLM, were used to detect defense genomic regions against leaf rust pathotypes at the adult and plant seedling growth stages. As a result, 363 and 464 MTAs were discovered for resistance to various pathotypes by using MLM and mrMLM, respectively (P value < 0.001) ( Table S9). The mrMLM model appeared the more powerful relative to MLM in our analysis, indicating the max number of highly significant MTAs (74), while MLM was the least potent, as it found the lowest number of highly significant MTAs (39) ( Table 2). The highest number of associations was located on the B genome (57.29%) in Chr.1B, Chr.2B, Chr.3B, and Chr.6B (Fig. 3). Exploring both MLM and mrMLM algorithms led to the highest number of MTAs in LR-97-12 pathotype. The phenotypic diversity (R 2 ) for both adult and plant seedling growth stages ranged from 2.05% to 3.15%, showing that wheat resistance to pathotypes is modulated by several genomic loci with moderate to small impacts ( Table 2).

Pleiotropic MTAs
Generally speaking, crop response to various pathotypes is connected, and thereby complicated biological mechanisms are responsible for this coordination. In fact, the pleiotropic function of genetic regions on various pathotypes leads to a connection between pathotypes. In this experimental research, some MTAs were observed with pleiotropic impact on the wheat reaction to leaf rust pathotypes (Table 2; Fig. 4). For instance, the MTA rs7934 on Chr.3A was related to all four pathotypes.

Putative candidate genes
Gene annotation uncovered 80 highly significant MTAs inside protein-coding regions (Table S10), of which 16 reliable MTAs were considered ( Table 3). The genes harbouring MTAs mainly encode proteins involved in biological processes in the leaf rust-infected crops, including protein phosphorylation and ubiquitination. The QQ and Manhattan plots of highly associated haplotypes for leaf rust resistance are displayed in Fig. 5. Manhattan plots exhibited significant markers related to resistance to leaf rust pathotypes at P-value = 0.0001, as the significant cutoff.

Genomic prediction (GP)
The highest prediction accuracy was achieved for the GBLUP approach (Fig. 6). Overall, the genomic best linear unbiased prediction (GBLUP) model appeared better than ridge regression-best linear unbiased prediction (RR-BLUP) and bayesian ridge regression (BRR), highlighting that it is the preferable algorithm to use for genomic selection in the Iranian wheat panel.

Discussion
Exploring new resources of resistance QTLs/genes is an ongoing task and is imperative in crop improvement for combating the pathogen threat to productivity. These genes or QTLs can be pyramided towards the development of durable resistant varieties. To achieve these goals, GWAS is used successfully in wheat to discover genomic regions involved in leaf rust resistance at the adult plant and seedling stages [3,28]. Therefore, this study was aimed at detecting resistance genes/ QTLs/markers against leaf rust in Iranian bread wheat accessions.

Phenotypic variation for wheat resistance to leaf rust
Iranian wheat cultivars and landraces were screened for resistance to leaf rust at the adult plant and seedling stages under field and in a controlled situation, respectively. The crop panel exhibited a diversity in the leaf rust response in both growth stages. The distribution of leaf  [20,23], that emphasized on the importance of natural populations to find resistance sources for leaf rust.
From correlation analysis, significant relations were found between the infection types (ITs) of P. triticina pathotypes. This high correlation may originate due to several factors. The first evidence comes from a virulence profiling of wheat varieties harboring the Lr gene that exhibited all P. triticina races were virulent to Lr14a, Lr11, Lr3, and Lr1 [21]. Thus, it is expected correlations among the phenotypic data. Second evidence, the association panel in the current work likely had common QTLs conferring resistance to P. triticina pathotypes and this was verified further by association mapping, which allowed the discovery of common QTLs involved in resistance to all four pathotypes. A similar observation was observed earlier, where a high  24:83 correlation was recorded in the phenotypic reactions with multiple P. triticina races, followed by detecting common QTLs for resistance to those races [21].

Population structure
Since genetic variation is the critical factor for crop improvement endeavors [29], assessment of the population structure is a requirement for using a genetic resource in wheat manipulation [30]. From our observations, different analysis approaches agreed with the presence of three subpopulations and also consent with the geographic origin. Structure analysis revealed three subpopulations among the 320 Iranian wheat accessions and the outcome from the PCA also supports this grouping. Interestingly, the clustering pattern of wheat accessions was not in line with their origins or geographical distributions. It seems that this is likely due to the farmers' migration and germplasm exchange across institutes [3].

LD of marker pairs
A total of 18,932 SNPs was recorded on wheat chromosomes, especially in genome B, which possesses the highest density of markers, followed by the A and D genomes. A similar pattern has been also reported by others [31,32]. The higher variation in the B and A wheat genomes is likely the consequence of two factors [33], the older evolutionary history of these genomes, and gene flow from the species T. turgidum (but not Ae. tauschii) to common wheat. The fact that wheat cultivars show higher LD in contrast to landraces is likely the consequence of selection events during crop breeding practices. Overall, mating systems, recombination, genetic drift, population relatedness, mutation, and selection are all major elements influencing LD in wheat [34]. The LD decay rate was more rapid in the D genome than in others, which is in line with earlier reports [15].

MLM and mrMLM
The mrMLM model appeared the more powerful relative to MLM in our analysis, indicating the max number of highly significant associations, while MLM was the least potent, as it found the lowest number of highly significant associations. MLM single-locus model adopts a genome scan test one SNP at a time while needing multiple corrections (e.g., Bonferroni) for avoiding false positives. This process is too conservative and may lead to the loss of actual associations that are fundamental to the intended characteristic [25. Moreover, single-locus models cannot simultaneously estimate all marker impacts, and thereby cannot present a proper model for genetic mapping the quantitative properties, which are governed by the cumulative act of numerous genes [25]. For overcoming these challenges, the mrMLM multi-locus approach was also adopted for dissecting the molecular basis of stress tolerance in wheat [12].

Putative candidate genes
GWAS is a well-known molecular approach in crop breeding programs to discover MTAs related to pathogen resistance. To date, this approach has been utilized successfully in plants to find genomic regions involved in leaf rust resistance at the adult plant and seedling stages [23,28]. Our attempt led to detecting a total of 80 highly significant MTAs and 16 reliable MTAs for leaf rust resistance. The distribution of these MTAs was found cross 21 chromosomes except for Chr1D, Chr3D, Chr4D, and Chr7D. Earlier experimental evidence has also detected MTAs on almost 21 chromosomes [35], as both minor as well as major genes be involved in leaf rust resistance during adult plant and seedling. For example, it has been stated there are not any MTAs related to seedling/adult leaf rust resistance on Chr6A, Chr5B, Chr4D, Chr4B, and Chr4A on wheat accessions [36]. The location of MTAs detected in the recent work was compared with those of previous reports and the results were presented at Table S11. We must remind that for some of the MTAs, it is a difficult comparison across various studies because of the difference in the mapping population and marker platforms, as well as the absence of a consensus map for comparing MTA locations [1]. Also, the reliable comparison is possible only on the basis  [37][38][39].
From functional point of view, most genes responsible for leaf rust resistance included those coding proteins involved in Mg 2+ ion binding, terpene synthase activity, lyase activity, aspartic-type endopeptidase activity, etc. For instance, the putative candidate gene related to seedling resistance to LR-97-12 pathotype belongs to the protein serine/threonine kinases, which have a key function in disease resistance and recognition of pathogens [15]. In a similar research attempt, exploring MTAs related to seedling/adult leaf rust resistance on a panel of 400 wheat accessions led to the identification of candidate genes, such as serine-threonine and leucine-rich repeat receptor-like protein kinases, and P-loop containing nucleoside triphosphate hydrolases, which have a function in disease resistance and pathogen recognition [36]. The flanking SNP closely linked to leaf rust resistance can be converted to allele-specific markers in MAS to deliver these genomic regions into wheat breeding lines [4].
Among the ~ 80 Lr genes, most genes provide SR, and only a few genes such as Lr77, Lr68, Lr67, Lr46, Lr34, Lr22, Lr13, and Lr12 have been reported to harbor APR [41]. After passing the validation process, the markers associated with Lr13, Lr22, and Lr68 genes in this study can be used to pyramid the APR genes and improve leaf rust resistance in wheat high-yielding cultivars. Moreover, annotation of genes for APR uncovered the most similar proteins responsible for SR since both share the elicitor responses and signalizing cascades. A metaanalysis explored consensus genomic regions conferring wheat resistance to leaf rust by using 393 QTLs collected from 50 QTL mapping reports [3]. The finding was the detecting of 15 high confidence meta-QTLs, which can be used in MAS. For example, MQTL7B.3 is co-localized with both Lr14a and Lr68 genes. The SR gene Lr14a is presumed to have originated from emmer wheat and is related to resistance genes Lr68 and Sr17 (for powdery mildew as well as stem rust). Moreover, Lr14a can confer APR to most P. triticina races with low-to-medium ITs as well as is connected with necrosis. As a result, the MQTL7B.3 not only induces SR as well as APR but also provides a region of resistance to multiple diseases.

Genomic prediction
Generally speaking, the GP accuracy depends on several factors, including the LD levels, genetic diversity in populations, genomic selection methodology, and the genetic architecture of the trait [42]. In the recent study, we found that GBLUP appeared as better than RR-BLUP and BRR, reflecting that GBLUP is a potent model for implementing genomic selection in wheat accessions. From earlier studies, high prediction accuracy can be obtained by GBLUP if SNPs are tightly linked to the intended trait [12]. RR-BLUP works well for crop traits where the genetic architecture consists of multiple loci with small impacts, while the BRR is similar to RR-BLUP, except marker impact shrinkage depends on the size of the population in BRR. The better performance of GBLUP depends on the fact that SNPs in the recent work were closely linked with wheat resistance to leaf rust.

Conclusions
The current study was focused on 320 Iranian bread wheat cultivars and landraces against four prevalent races of P. triticina in Iran. GWAS analysis led to discovering 80 highly significant and 16 reliable MTAs for leaf rut resistance on almost all chromosomes. Among these, six QTLs including rs20781 and rs20782 (for LR-97-12), rs49543 and rs52026 (for LR-98-22), as well as rs44885 (for LR-98-22, LR-98-1, and LR-99-2) and rs44886 (for LR-98-22, LR-98-12, LR-99-2) were found on genomic regions where no Lr genes previously reported in wheat, suggesting new loci for leaf rust resistance. Other QTLs were uncovered in the adjacency of previously reported Lr QTLs/genes, thus, further analysis such as an allelism test is needed to specify their association. The rs44885 and rs44886 MTAs appeared to be reliable for resistance to all pathogenic races with the main impact and altogether justified up to 6% of the observational diversity.

Plant and pathogen materials
A total of 320 Iranian bread wheat accessions were obtained from the seed collection of University of Tehran -Agriculture and Natural Resources (UT-ANR), and the Seed and Plant Improvement Institute (SPII), Karaj, Iran (Table S1). The evaluation of the desired genotypes in the adult and seedling stages against the dominant pathotypes of leaf rust in the country (LR-99-2, LR-98-12, LR-98-22, and LR-97-12) was carried out in the cereal pathology greenhouse of the SPII. The field experiment with the common pathotypes of Khuzestan province was conducted at the Ahvaz station and the common pathotypes of Alborz province at the research field of UT-ANR. The study protocol must comply with relevant institutional, national, and international guidelines and legislation.

Seedling resistance evaluation
Cultivation practices and recording of the response of wheat accessions to the rust were carried out in the spring. The accessions were cultivated in the form of a randomized complete block design (RCBD), which was implemented with two replicates. In all experiments, for each accession, seven seeds were planted in a pot (containing peat moss, sand, and field soil) and kept in greenhouse conditions (at 21 °C and 65% relative humidity). Irrigation was also done by the leakage method. About 8-10 days after sowing the seeds, with the completion of the first leaf, the seedlings were ready for inoculation. Inoculation was done by using a mix of brown rust spores and Talc powder in a ratio of 1:4 by using a brush. After inoculation, the pots were kept for 24 h in a dark and cold room at 17 °C and saturated relative humidity. The pots then were transferred to greenhouses at 22 °C, photoperiod of 16/8 h of light/darkness, and relative humidity of 75%. About 10 days after inoculation, the infection types created on the wheat accessions were recorded based on a scale of 0 to 4 as follows [43]: O or immune, without any visible marks and pustules; H or hypersensitivity, the appearance of hypersensitive flecks in the form of necrosis and chlorosis without pustules; 1 or resistant, small pustules covered by necrotic spots; 2 or semi-resistant, small to moderate pustules covered by chlorosis or necrosis; 3 or susceptible, moderate-sized pustules that may be associated with chlorosis; 4 or very susceptible, small pustules without chlorosis and necrosis. Finally, the 0-4 scale was converted to 0-9 numerical scales, scores in the range of 0-7 were considered as resistant and above 7 as susceptible [44].

Adult plant stage evaluation
In the Ahvaz location, the genotypes were cultivated in the middle of December 2018 in the field research of Khuzestan Agriculture and Natural Resources, Ahvaz, Iran (latitude: 31˚20ꞌN, longitude: 48˚40ꞌE, and height from the sea level: 13 m). Each of the wheat accessions was cultivated on a one-meter line with a distance of 30 cm. After ten accessions, the susceptible variety Bolani was cultivated as a control. This susceptible variety was also cultivated around the experimental as the spreader of the disease and its over-receiver. This experiment was evaluated under sprinkler irrigation, which allows the application of water under high pressure with the help of a pump. Artificial inoculation was done using a mixture of brown rust pathotypes of the region (in equal proportion of each pathotype) starting from January 10 th and until February 25 th , once every 15 days, in the evening. After the uniformity of the appearance of the disease on the susceptible variety (Bolani), the severity of the flag leaf infection was recorded by using determining the percentage of leaf surface contamination (0-100%) [45] and also by determining the IT [46] on four occasions with 10-day intervals from the 6th of March. Scoring was as follows [44]: O or immune, without any symptoms; R or resistant, the appearance of chlorotic and necrotic band spots without pustules, or small and scattered pustules; MR or semi-resistant, the appearance of small rust pustules surrounded by necrotic spots; MS or semi-susceptible, the appearance of medium-sized pustules, without necrotic spots, sometimes with chlorotic spots; S or susceptible, the appearance of large rust pustules in abundance without chlorotic spots, sometimes with these spots.
When the disease severity in the susceptible variety Bolani reached 100%, the last recording (4 th ) in wheat accessions was considered as the final severity of infection FDS. The severity of the disease in each note-taking and its host reaction (infection types including R or resistant, MR or semi-resistant, MS or semi-susceptible, and S or susceptible) were combined to calculate the CI. That is, the CI was obtained from the product of the disease severity and the constant-coefficient related to the host reaction (S = 1, MS = 0.8, MR = 0.4, R = 0.2, and 0 = 0). From the obtained CI, the area under the disease progression curve (AUDPC) was estimated as follows: where t i , recording time t i ; t i+1 − t i : recording time t i + 1 m; y i : rust infection coefficient at the time of recording t i ; y i+1 : rust infection rate at the time of recording t i+1 ; N: the number of records to assess the disease severity.
In the Karaj location, the genotypes were cultivated in the middle of November 2019 in the research field of University of Tehran (latitude: 36˚00ꞌN, longitude: 48˚40ꞌE, and height from the sea level: 1137 m). Each of the wheat accessions was cultivated on 2 m lines with a distance of 30 cm. Artificial inoculation was done using a mixture of brown rust pathotypes of the region (in equal proportion of each pathotype) starting from March 30 th and until May 15 th , once every 15 days, in the evening. After the uniformity of the appearance of the disease on the susceptible variety (Bolani), the severity of the flag leaf disease was recorded by using determining the percentage of leaf surface contamination (0-100%) [45] and also by determining the infection type [46] on four occasions with 10-day intervals from the 6th of March.

Statistical analysis
The phenotypic data were analyzed via SAS V 9.4. Pearson's correlation was estimated using the corrplot package in R. For determining if the phenotypic data for each race was normally distributed, the Shapiro-Wilk (SW) test was carried out using the Proc-Univariate procedure. Moreover, for checking the data homogeneity, Levene's test was conducted. The entire mean was utilized for the association analysis if the data were homogenous. To calculate H 2 (broad-sense heritability), genotypic variation was divided by the sum of error, block, and genotypic variances.

Genotyping and SNP imputation
To genotype Iranian bread wheat cultivars and landraces by genotyping-by-sequencing (GBS), the GBS library was developed and sequenced [47,48]. BLAST, which permits for mismatches up to 3 bp, was exploited to discover SNPs. These polymorphisms were called by TASSEL using the UNEAK pipeline [49]. To avoid false-positive SNPs, only those with a missing rate (MR) < 10% across samples, heterozygosity (H) < 10%, and minor allele frequencies (MAF) > 1% were included. Missing data were imputed via TASSEL using the LD KNNi [49]. As the reference genome, the W7984 bread wheat genome was adopted to call SNPs [50].

Population genetic analysis
Population structure was determined using a mixed model in the Structure [51]. The number of subpopulations (K) varied from K = 1 to K = 10 and for each value, ten independent runs of 10,000 burn-in and 10,000 MCMC steps were done. The most likely K value was specified via the ΔK in the Structure harvester [52]. The matrix of population structure, i.e. Q, was obtained for the whole population from the Structure analysis for the best value of K [49]. The kinship matrix (K) was also obtained using the EMMA in the R [53]. To support the Structure outcomes, a principal component analysis (PCA) was applied to the SNPs using the Tidyverse in the R [50].

LD
From the value of expected/observed allele frequencies, LD occurred among SNPs was estimated in TASSEL V.5.
The pairwise LD was calculated using the squared correlation coefficient of alleles (r 2 ). The full matrix option was used to estimate the LD distribution for each subpopulation and in the whole panel. From the theoretical expectation of r 2 , LD decay was determined for each chromosome and genome [54].

GWAS analysis
Association mapping was implemented using MLM single-locus [26] and mrMLM multi-locus [25]. Briefly, the MLM procedure was conducted by MLM package procedure, while the mrMLM using the mrMLM package in R [25] in two steps. First, all potentially associated SNPs were included in a second model where their impacts were estimated using an empirical Bayes approach. Finally, a likelihood ratio test was used to evaluate all non-zero marker impacts. A significance threshold (cut-off ) of − log 10 (P-value) ≥ 3.0 (P ≤ 0.001) was considered for detecting significant associations in the model. All SNPs meeting this cut-off value were categorized as significant MTAs. GWAS outcomes were summarised using Manhattan plots to visualize associations between accessions and traits in the package GAPIT [55]. In this plot, the x-axis and y-axis present the genomic positions of SNPs and the − log 10 (P-value) obtained from the F-test, respectively. A Q-Q plot was also performed to test the distribution of P-values [56].

Identification of candidate genes
Surrounding all highly significant SNPs, genome sequences were collected and exploited for gene annotation with BLAST versus the IWGSC RefSeq v1.0 genome reference for wheat [50]. After alignment, genes with the highest blast score and identity were included. The biological processes and molecular functions of putative genes were found in Ensembl Plants [http:// plants. ensem bl. org]. The discovery of candidate genes was tested based on being located in the vicinity of the peak marker with a domain 1 Mb.

GP
The genomic prediction in this study was carried out using three models: RR-BLUP [57], GBLUP [58], and BRR [59]. All analyses for GP were implemented using the iPat software [60]. For three sub-populations, 10, 20, and 30% of accessions were assigned randomly to a validation set with the remaining individuals used as the training set. For all of the GP procedures, the whole prediction process was repeated 100 times for each method. The prediction accuracy was represented as the correlation between BLUPs and GEBVs over the validation as well as training sets [12].