PLoS One. 2011; 6(2): e17279.
Published online 2011 Feb 18. doi:  10.1371/journal.pone.0017279
PMCID: PMC3041829

Genetic Diversity and Linkage Disequilibrium in Chinese Bread Wheat (Triticum aestivum L.) Revealed by SSR Markers

Pär Ingvarsson, Editor


Two hundred and fifty bread wheat lines, mainly Chinese mini core accessions, were assayed for polymorphism and linkage disequilibrium (LD) based on 512 whole-genome microsatellite loci representing a mean marker density of 5.1 cM. A total of 6,724 alleles ranging from 1 to 49 per locus were identified in all collections. The mean PIC value was 0.650, ranging from 0 to 0.965. Population structure and principal coordinate analysis revealed that landraces and modern varieties were two relatively independent genetic sub-groups. Landraces had a higher allelic diversity than modern varieties with respect to both genomes and chromosomes in terms of total number of alleles and allelic richness. Rare alleles with frequencies of <5% numbered 3,833 (57.0%) and 2,788 (41.5%) in the landrace and modern variety gene pools, respectively, indicating greater numbers of rare variants, or likely new alleles, in landraces. Analysis of molecular variance (AMOVA) showed that the A genome had the largest genetic differentiation and the D genome the lowest. In contrast to genetic diversity, modern varieties displayed a wider average LD decay across the whole genome for locus pairs with r²>0.05 (P<0.001) than the landraces. Mean LD decay distance for the landraces at the whole genome level was <5 cM, whereas it was higher, at 5–10 cM, in modern varieties. LD decay distances also differed somewhat among the 21 chromosomes, being higher for most chromosomes in modern varieties (from <5 to 25 cM) than in landraces (from <5 to 15 cM), presumably indicating the influences of domestication and breeding. This study facilitates predicting the marker density required to effectively associate genotypes with traits in Chinese wheat genetic resources.


Bread wheat (Triticum aestivum L.) is one of the most important cereal crops worldwide, including in China. Wheat is grown in 30 of China's 31 provinces in 10 major agro-ecological zones based on wheat type, growing season, and varietal response to temperature and photoperiod [1], [2]. China is also regarded as one of the centers of diversity of common wheat [3]. Due to a long cultivation history and artificial selection in different ecological regions, about 23,135 domesticated accessions (11,694 landraces and 11,441 modern varieties) constitute the Chinese basic collection conserved in the national genebank [http://icgr.caas.net.cn/cgris_english.html]. Recently, a candidate wheat core collection (5,029 accessions) was established based on geographical regions, ecotypes, and 21 agronomic and botanic characters of the basic collections [3]. Given the utility of core collections in crop wild relatives [4], and using a strategy for unlocking genetic potential in crops proposed by Tanksley and McCouch [5], both a core collection, with 1,160 accessions (5% of the national collection) representing 91.5% of the genetic diversity, and a mini core collection consisting of 262 accessions with an estimated 70% representation of the genetic variation in the full collection, were constructed based on 4×10⁵ SSR data-points [6]. This mini core collection is a suitable platform for in-depth evaluation, effective utilization and genetic research in Chinese wheat genetic resources [7], [8].

Linkage disequilibrium (LD), or nonrandom association of alleles between loci (linked or unlinked), is becoming increasingly important for identifying genetic regions associated with agronomic traits [9]–[12]. Recently, genome-wide LD studies were performed on various crop plants, such as maize (Zea mays L.) [13]–[15], rice (Oryza sativa L.) [16], [17], barley (Hordeum vulgare L.) [18], [19], sorghum (Sorghum bicolor L. Moench) [20], durum wheat (T. turgidum L. var. durum) [21] and soybean (Glycine max L. Merr.) [22], [23]. From an analysis of 242 genomic SSRs among 43 elite US wheat cultivars, Chao et al., [24] reported genome-wide LD estimates of generally less than 1 cM for genetically linked locus pairs, and that most of the LD was between loci less than 10 cM apart. Somers et al., [25] genotyped 189 bread wheat accessions at 370 loci and 93 durum wheat accessions at 245 loci to examine linkage disequilibrium across the genome, and found that LD mapping of wheat can be performed with simple sequence repeats to a resolution of <5 cM. Most of the diversity and LD analyses on wheat were undertaken at the whole genome level, with the exception of two recent studies at the chromosome level [26], [27]. Breseghello and Sorrells [26] found consistent LD of less than 1 cM for chromosome 2D and about 5 cM in the centromeric region of 5A using 33 and 20 SSR markers, respectively. Horvath et al., [27] suggested that chromosome 3B had a lower diversity than average for the entire B-genome; LD was weak in all materials studied, and marker pairs in significant LD were generally concentrated around the centromere in both arms and at distal positions on the short arm. However, all LD studies to date were based on limited numbers of loci and small sample sizes. It would be valuable to estimate LD decay in bread wheat at the whole genome level and with a larger genetic representation of wheat genotypes.

Based on genotyping of 250 accessions mostly from the Chinese bread wheat mini core collection (70% genetic diversity of the initial collection) using 512 microsatellite loci distributed over all 21 chromosomes, the objectives of this study were: 1) to evaluate the allelic diversity within the Chinese wheat collection; 2) to analyze the population structure and compare the diversity level between landraces and modern varieties; 3) to investigate genetic differentiation of wheat genomes within the two gene pools; and 4) to examine the extent and genomic structure of LD between pairs of SSR markers on both genome-wide and chromosome scales. The results of this study should describe the level of genetic diversity and linkage disequilibrium decay of a representative Chinese collection for breeding and genetic research, and provide a molecular basis to enrich genetic diversity of bread wheat worldwide.


Overall Diversity of Chinese Wheat Collections

The genetic characteristics of the 250 member Chinese wheat mini core collection based on 512 microsatellite loci are listed in Table 1. Among 512 SSR loci, 99.4% (509) were polymorphic with just 3 being monomorphic. A total of 6,724 alleles ranging from 1 to 49/locus were detected. PIC values ranged from 0 to 0.967 and the total number of rare alleles with a frequency of less than 5% reached 4,424 (65.8%), indicating that many new alleles occurred in the mini core collection. As expected from the way in which the collection was constructed, the combination of mean genetic richness (13.1) and genetic diversity index (0.650) indicated high levels of polymorphism.

Table 1
Allelic diversity of Chinese wheat collections at 512 whole-genome SSR loci.

Genetic Structure of the Wheat Collection

Population structure of the whole collection was investigated using a Bayesian clustering approach to infer the number of clusters (populations) with STRUCTURE v2.2 software [28]. K = 2 gave the best separation, with the highest delta K value (Figure 1).

Figure 1
Estimation of the number of populations for K ranging from 1 to 11 by calculating delta K values.

Principal coordinate analysis also indicated two major sub-groups within the mini core collection (Figure 2). A large proportion of accessions formed one sub-group indicated to the left, and the other sub-group (right) included accessions predominantly in the modern cultivar sub-group. The greater scattering of the landrace sub-group indicated its higher diversity. Overlap between the two gene pools, indicated by intermediates in both sub-groups, was probably caused by breeding activities in the 1940s–1950s. During this period new varieties were produced from landrace × introduced cultivar hybrids [2]. Consistent with previous studies [29]–[31], principal coordinate analysis of the mini core collection clearly indicated that Chinese landraces and modern varieties comprised separate sub-groups of genotypes.

Figure 2
Principal coordinate analysis of 250 Chinese wheat accessions based on 512 microsatellite markers indicating separation of the landrace (red) and modern variety (blue) sub-groups.

Genetic Characteristics within the Landrace and Modern Variety Sub-groups

The basic statistics of genetic diversity between the landrace and modern variety sub-groups at the genome level are listed in Table 2. In total, 6,122 alleles ranging from 1 to 46 were identified at 512 SSR loci in the landrace sub-group, compared with 5,004 alleles ranging from 1 to 35 in the modern variety sub-group. Similarly, private alleles of the landrace were 1,720 (28.1%), but just 602 (12.0%) for modern varieties. Correspondingly, there were 3,833 (62.6%) rare alleles with frequencies <5% for the landrace sub-group whereas this number was 2,788 (55.7%) for modern varieties, indicating higher genetic variation in novel alleles for the landraces than in modern varieties. Allele number per locus (Figure 3A) and PIC per locus (Figure 3B) for all SSR loci were continuously distributed in both landrace and modern varieties. Within the sub-groups allelic numbers per locus ranged from 4 to 13, and PIC values ranged from 0.6 to 0.9, indicating high polymorphism levels in both sub-groups. Both mean genetic richness (12.0) and genetic diversity indices (0.640) of the landraces were higher than those for modern varieties at 9.8 and 0.628, respectively. Consistent with landraces having higher diversity than modern varieties at the whole genome level, these relationships were retained when the three genomes were individually compared. To eliminate the influence of sample size on evaluations of genetic diversity, allelic richness calculated following rarefaction on samples of 68 accessions per sub-group likewise indicated that landraces had higher genetic diversity than modern varieties (Table 2).

Figure 3
Distribution of allele numbers per locus (A) and PIC per locus (B) calculated using rarefied samples of 68 accessions per sub-group for all SSR loci.
Table 2
Comparison of genetic diversity between the landrace and modern variety sub-groups at the genome level.

A comparative analysis of genetic characteristics between sub-groups was performed at the chromosome level (Table 3). For all chromosomes, the total number of alleles was higher for landraces (130 to 450) than for modern varieties (99 to 361). Again, comparing 68 accessions from each sub-group confirmed that landraces had more alleles per locus than modern varieties at the chromosome level. Except for chromosomes 1B, 5A and 5B, landraces had much higher PIC values than modern varieties. The total number of private alleles was also higher for landraces (26 to 140) than for modern varieties (6 to 49) on all individual chromosomes, and the distribution of rare alleles in the two gene pools followed the same pattern. The mean Fst value between landraces and modern varieties across all chromosomes was 0.021, ranging from 0.010 to 0.035. Chromosome 3A (0.035) showed the highest genetic differentiation and 1D (0.010) the lowest (Table 3).

Table 3
Total numbers of alleles, mean PIC values, private alleles and Fst within and between landraces and modern varieties at the chromosome level.

By comparing PIC values between the landrace and modern variety sub-groups on homologous group 2 chromosomes (Figure 4), we found that genetic differentiation between them might not be on a genome-wide scale, but rather at selected loci or chromosome intervals, exemplified by the chromosome 2A interval gwm558-gwm312. Within this region, modern varieties show a large reduction in diversity, as shown by the lower PIC values, with selection being one of several possible explanations. Similar comparisons for all chromosomes can be made from Figure S2.

Figure 4
PIC distribution of SSR loci between landrace and modern variety sub-groups on chromosomes 2A, 2B and 2D. Blue curve shows PIC changes in modern varieties, and red curve shows changes in the landrace sub-group.

Comparisons of the landrace and modern variety sub-groups on the basis of genomes are shown in Table 4. Evaluated by Shannon's information index (I) and genetic distance (GD), the A genome was the most diverse and the D genome was the least. Gene flow estimated from Fst (Nm) and genetic identity (GI) placed the A genome lowest among the three genomes, while the D genome ranked first. This indicated that the largest genetic differentiation between the two sub-sets was within the A genome with the least differences in the D.

Table 4
Genetic differences between landraces and modern varieties in different genomes.
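
The inverse relationship between Fst and gene flow described above can be illustrated with a short sketch. It assumes the usual island-model approximation Nm = 0.25(1 − Fst)/Fst; the function name `gene_flow` is hypothetical, and the inputs are the chromosome-level Fst values quoted in the text (0.010, 0.021 and 0.035), not the genome-level values of Table 4.

```python
def gene_flow(fst):
    """Approximate gene flow (Nm) from Fst, assuming Nm = 0.25 * (1 - Fst) / Fst."""
    if not 0.0 < fst <= 1.0:
        raise ValueError("Fst must lie in (0, 1]")
    return 0.25 * (1.0 - fst) / fst


# Chromosome-level Fst values quoted in the text: lowest (1D), mean, highest (3A).
for label, fst in [("1D", 0.010), ("mean", 0.021), ("3A", 0.035)]:
    print(label, round(gene_flow(fst), 2))
```

Under this assumption the three values give Nm of approximately 24.75, 11.65 and 6.89, so the unit with the highest Fst has the lowest estimated gene flow, which is the ordering reported for the A and D genomes.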

Analysis of molecular variance (AMOVA) between the landraces and modern varieties by genomes was also carried out (Table 5). All sources of variation were highly significant (P<0.001) and more than 95% of the variance was explained by differences within the A, B and D genomes, whereas only a small part of the overall variance (less than 5%) could be attributed to differences between landraces and modern varieties. The AMOVA analysis also revealed similar structures of genetic differentiation consistent with the basic statistics (Table 3) when comparing the three genomes. The amount of variation in the A genome (4.73%) was higher than that of the B (4.25%) and D (3.05%) genomes again indicating that the A genome had the largest molecular variance and D genome the lowest between the two sub-groups.

Table 5
Analysis of molecular variance (AMOVA) between landraces and modern varieties within genome.

Linkage Disequilibrium at the Whole Genome Level

After deletion of some low frequency alleles (<5%) in both sub-groups, 495 loci were chosen to evaluate the extent of linkage disequilibrium (LD) on a whole genome level in the two wheat gene pools (Table 6). There were 143 (149), 171 (177) and 181 (186) loci on the A, B and D genomes, respectively, available for LD evaluations. Across all 495 loci, 6,171 possible linked locus pairs (in the same linkage groups) and 116,094 unlinked locus pairs (from different linkage groups) could be detected in both sub-groups. Among linked locus pairs, 149 (2.41%) of 4,577 compared were in LD at the P<0.001 level for landraces, whereas there were 275 (4.46%) of 4,736 in significant LD among modern varieties. In addition, the numbers of locus pairs in LD with r²>0.1 or r²>0.2 in modern varieties were also relatively higher than those in landraces. Furthermore, the mean r² for all significant LD (P<0.001) in modern varieties (0.049, ranging from 0.015 to 0.348) was still larger than for landraces (0.030, ranging from 0.008 to 0.371). Although the landraces possessed more significant unlinked locus pairs (P<0.001) than modern varieties, i.e. 1,509 (1.30%) vs 1,019 (0.88%), modern varieties had higher values for the other r² parameters. LD comparisons on a genome basis showed similar trends of higher LD in modern varieties than in landraces, even though there were very low genome-wide LD levels in both sub-groups. Plots of significant r² values (P<0.001) between locus pairs in different genomes of the two sub-groups (Figure 5) further supported earlier results.

Figure 5
Plots of significant r² values (P<0.001) between locus pairs on A, B, D and whole genomes in landraces and modern varieties.
Table 6
SSR locus pairs in significant (P<0.001) linkage disequilibrium (LD) and r² values between Chinese landraces and modern varieties.

To reveal LD decay distances in the two sub-groups on a whole genome scale, we plotted the percentage of locus pairs with significant (P<0.001) LD and the mean r² across distance intervals for each gene pool (Figure 6). The percentage of locus pairs in significant (P<0.001) LD decreased as genetic distance increased, and the proportions of significant LD were generally higher within 10 cM. However, mean r² was unevenly distributed along distance intervals, i.e. there were some points with relatively higher mean r² at larger intervals. Considering the low LD values for our samples (Table 6, Figure 6), we determined average LD decay distance in the different genomes for locus pairs with r²>0.05 at P<0.001 in the landrace and modern variety sub-groups (Table 7). Mean LD decay distance for landraces at a whole genome level was <5 cM, with higher LD decay distances in modern varieties for the same genomes. In the modern variety sub-group, the decay distance increased to 5–10 cM for the B, D and whole genomes, and to 15–20 cM for the A genome, which might reflect demographic history causing genome-level changes in modern varieties.

Figure 6
Percentage of locus pairs in significant (P<0.001) LD and mean r² among distance intervals for A, B, D and whole genomes in the landrace and modern variety sub-groups.
Table 7
Average LD decay distance in different genomes for locus pairs with r²>0.05 at P<0.001 in landraces and modern varieties.
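
The per-interval summaries plotted in Figure 6 (percentage of locus pairs in significant LD and mean r² of the significant pairs, by distance interval) could be computed along the following lines. This is a minimal sketch, not the procedure actually used: the function name `ld_by_distance`, the 5 cM bin width and the input layout of (distance, r², P) tuples are assumptions, and the example values are invented. The rule used to derive the decay distances of Table 7 from such summaries is not reproduced here.

```python
def ld_by_distance(pairs, bin_width=5.0, alpha=0.001):
    """Summarise linked locus pairs by genetic-distance interval.

    pairs: iterable of (distance_cM, r2, p_value) tuples, one per locus pair.
    Returns {bin_start: (n_pairs, pct_significant, mean_r2_of_significant)}.
    """
    bins = {}
    for distance, r2, p_value in pairs:
        start = int(distance // bin_width) * bin_width
        bins.setdefault(start, []).append((r2, p_value))

    summary = {}
    for start in sorted(bins):
        entries = bins[start]
        significant = [r2 for r2, p_value in entries if p_value < alpha]
        pct = 100.0 * len(significant) / len(entries)
        mean_r2 = sum(significant) / len(significant) if significant else 0.0
        summary[start] = (len(entries), pct, mean_r2)
    return summary


# Invented example values, for illustration only.
example = [(1.0, 0.2, 0.0001), (3.0, 0.01, 0.5), (7.0, 0.06, 0.0005), (12.0, 0.02, 0.2)]
for start, (n, pct, mean_r2) in ld_by_distance(example).items():
    print(f"{start:>5.1f}-{start + 5.0:<5.1f} cM  n={n}  significant={pct:.1f}%  mean r2={mean_r2:.3f}")
```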

Linkage Disequilibrium at the Chromosome Level

After scanning the extent and structure of LD between landraces and modern varieties on a whole genome scale, the same evaluations were performed at the single chromosome level based on 495 SSR loci in the two gene pools (Tables 8 and 9). Comparing SSR locus pairs in significant (P<0.001) LD and mean r² values between landraces and modern varieties (Table 8), the mean number of locus pairs in significant LD per chromosome was 7.1 (2.41%), ranging from 1 to 30, for the landrace sub-group, and 13.1 (4.47%), ranging from 2 to 35, in the modern variety sub-group. Correspondingly, mean r² of the landrace sub-group was only 0.033, ranging from 0.011 to 0.140, whereas in the modern variety sub-group it was 0.053, ranging from 0.026 to 0.194. At the individual chromosome level, except for chromosomes 1A and 6D, the modern varieties had more SSR locus pairs in significant LD. Likewise, the mean r² for modern varieties was larger than for landraces for all chromosomes except 4A and 4D. Therefore, compared with the landrace sub-group, the modern variety gene pool had higher numbers of SSR locus pairs in significant LD and higher mean r² values for almost all wheat chromosomes. However, these parameters were not compared among chromosomes within the same gene pool because the number of loci selected differed greatly among chromosomes in the present study.

Table 8
SSR locus pairs in significant (P<0.001) LD on all 21 chromosomes and r² values between Chinese landraces and modern varieties.
Table 9
Average LD decay distance in different chromosomes in the landrace and modern variety sub-groups for locus pairs with r²>0.05 at P<0.001.

Average LD decay distances on different chromosomes for locus pairs with r²>0.05 at P<0.001 in the two sub-groups are given in Table 9. In the landrace sub-group, LD decay distance was <5 cM for 19 of the 21 chromosomes, but 5–10 cM for 2B and 10–15 cM for 5A. In the modern variety sub-group, chromosomes 1B, 1D, 2D, 3A, 3B, 3D, 4A, 4B, 6B, 6D and 7B had LD decay distances of <5 cM, similar to the landraces, but the other 10 chromosomes showed wider LD decay distances than those of the landraces, notably 20–25 cM for 5A and 7D. These general descriptions of LD decay distance provide important information for decisions on marker densities in future association analyses at the chromosome level, and also indicate the different strengths of selective signals that breeding has imprinted on each chromosome.


Genetic Relationship and Population Structure

In our previous studies, 43 cornerstone breeding parents used before 1980 and widely grown varieties in current use in China [29], 96 random samples with maximized genetic diversity [30], a 340 candidate core collection from the Northwestern Spring Wheat Region [31], and a 1,110 member Chinese core collection [6], consistently demonstrated that Chinese landraces and modern varieties are relatively independent genetic sub-groups.

To address possible limitations in the number of loci used in the above-mentioned studies, we employed 512 microsatellite loci identifying 6,724 alleles to obtain a genetic structure of Chinese wheat genetic resources using principal coordinate analysis and Bayesian clustering approaches. The larger number of alleles identified in 512 SSR loci also indicated that individual microsatellite loci have higher information content [32]–[34]. Using a relatively large set of molecular data-points, the Chinese mini core collection was divided into two major sub-groups, essentially landraces and modern varieties. This was considered consistent with the history of Chinese wheat breeding. Within each sub-group there were some intermediate genotypes. Adopting a threshold probability of >0.50 for fitting one of the clusters [24], [26], 78 of 93 modern varieties were clearly assigned to one sub-group and 135 of 157 landraces to the other. Examples of the 37 varieties with a lower probability (<0.50) of fitting either sub-group included Lianglaiyoubaipixiaomai (Inner Mongolia), Bihongsui (Inner Mongolia), Mingxian 169 (Shanxi), Shite 14 (Hebei), Fuzhuang 30 (Shaanxi), and Jingyang 60 (Shaanxi). Even though they were arbitrarily classified as modern varieties, most of them were selections from landraces or were hybrid progeny of landraces [2], [6], and still retained most of the genetic characteristics of landraces.

Genetic Diversity in Chinese Wheat Gene Pools

Allelic diversity analysis in this study revealed that the total number of alleles amplified at 512 SSR loci in 250 accessions was 6,724 (13.1 alleles per locus on average, ranging from 1 to 49), and polymorphism information content values ranged from 0 to 0.967 (mean 0.650). These values were higher than the previously reported estimates of SSR marker diversity in wheat [24]–[26], [35], [36], in which allele number ranged from 4.81 to 10.5 and mean PIC value from 0.46 to 0.62. On the other hand, a genetic diversity of 0.77 and 18.1 alleles [37], 14.5 alleles and a genetic diversity of 0.662 [38], and 23.9 alleles per locus over 38 SSR markers [39] were also reported. Comparatively, the high SSR allele diversity found in the mini core collection approximately reflects the genetic representation of the entire set of Chinese wheat collections. Notably, a total of 4,424 alleles had frequencies of less than 5% among all accessions, and these so-called rare alleles represented 65.8% of all alleles detected. Like common alleles, rare variants or new alleles not subjected to artificial selection also play an important role in genome-wide genetic research [40].

The amounts of genetic diversity in the two gene pools and PIC values were significantly different at both the genome (Table 2) and individual chromosome (Table 3) levels, in terms of allelic richness calculated using equivalent numbers of accessions from each sub-group. Results of allelic diversity using 512 SSR markers indicated that the landraces (mean genetic richness: 12.0; genetic diversity index: 0.640; allelic richness: 10.7) had higher genetic diversity than modern varieties (mean genetic richness: 9.8; genetic diversity index: 0.628; allelic richness: 9.5). This was consistent with a previous study analyzing the 1,160-accession Chinese wheat core collection, including 762 landraces and 348 modern varieties, using 78 microsatellite markers [6]. Similar results were obtained for individual genomes and chromosomes. This implied there were more potentially rare variants or new alleles in the landrace gene pool. These could be of value for genetic research or breeding.

China has a more than four millennia history of wheat cultivation, and landraces became isolated because of limited transportation in earlier times [6]. Scientific breeding in China can be traced back only 50–90 years [2]. The history of Chinese wheat breeding shows that new varieties were usually selected from landraces in the early period, later from crosses between landraces and introduced varieties, and more recently from crosses between Chinese modern varieties. In this study, genetic analyses including Shannon's information index (I), genetic distance (GD), genetic differentiation coefficient (Fst), and analysis of molecular variance (AMOVA) between the landrace and modern variety sub-groups for different genomes suggested that the A genome (4.73%) was significantly more variable than the B (4.25%) and D (3.05%) genomes, indicating stronger selective pressure on the A genome during Chinese wheat breeding. However, a selection sweep imprinted across genomes suggested that some important loci or chromosomal intervals rather than whole genomes (or chromosomes) were responsible for the differences (Figure 4). This is consistent with findings in sunflower reported by Chapman et al., [41].

LD Level in Chinese Bread Wheat

A number of LD mapping studies in wheat were performed at the genome or chromosome levels [24]–[27], so it is important to examine the extent of LD in Chinese wheat genomes. This determines the genetic distances over which LD will decay back to a random association of alleles and facilitates prediction of marker density needed to effectively associate genotypes with traits [11]. In the present study 512 SSR loci with a mean marker density of 5.1 cM per locus, ranging from 2.2 to 9.4 cM for all 21 chromosomes, were used to measure LD in Chinese wheat genetic resources at both the genome and chromosome levels (Tables 6–9, Figures 5 and 6).

Population structure is one of several important factors that have strong influences on LD, besides recombination, mutation, population size, genetic drift, population mating pattern, admixture, and selection [10]. The presence of population stratification and an unequal distribution of alleles within groups can result in nonfunctional, spurious associations [42]. In our LD estimations, we took into account the effect of population structure by subdividing the genetic resources into two main gene pools, i.e. Chinese landraces and modern varieties, which were validated by the results of genetic structure with the software STRUCTURE v2.2 [28] (Figure 1) and principal coordinate analysis using NTSYS-pc version 2.1 software [43] (Figure 2).

LD decay distance with r²>0.05 at P<0.001 was consistent for all linked locus pairs in each gene pool. Mean LD decay distance for the landraces at the whole genome level was <5 cM, whereas higher values applied in the modern varieties. Specifically, in the modern variety sub-group the decay distance increased to 5–10 cM for the B, D and whole genomes, and to 15–20 cM for the A genome, possibly due to demographic history for genome-level changes [44]. At the chromosome level, LD decay distance was <5 cM for 19 of the 21 chromosomes in the landrace sub-group, but 5–10 cM for 2B and 10–15 cM for chromosome 5A. As for the modern variety sub-group, chromosomes 1B, 1D, 2D, 3A, 3B, 3D, 4A, 4B, 6B, 6D, and 7B had <5 cM values similar to the landraces, but the other 10 chromosomes showed wider LD decay distances extending to 20–25 cM for 5A and 7D. This indicated that these two chromosomes may carry more QTLs or genes related to important agronomic traits that were strongly selected in breeding [12]. Our results further demonstrated population-dependent and genome-dependent LD characteristics in comparison with genome-wide LD estimates of less than 1 cM in 43 US wheat elite cultivars [24], <5 cM for LD decay distance across the genome among 189 bread wheat accessions from western Canadian wheat breeding programs [25], and less than 1 cM on chromosome 2D and about 5 cM in the centromeric region of 5A in 95 soft winter wheat from the eastern United States [26].

In general, a significance level of P<0.001 was adopted as a comparison threshold. Thus, Somers et al., [25] found that bread and durum wheat collections had 47.9% and 14.0% of all locus pairs in significant LD, but within the groups only 0.9% (bread wheat) and 3.2% (durum wheat) of locus pairs were in LD with r²>0.2. Malysheva-Otto et al., [19] also showed that 100% of locus pairs were in significant LD based on r²>0.05 at P<0.001 in a wide set of barley varieties, but this fell to 45% in a European 2-rowed spring barley subgroup. This indicated that the number of loci in significant LD, as well as the extent of LD, was clearly dependent on the population structure and on different genomes. In the present study, 2.41% and 4.46% of locus pairs for the Chinese landrace and modern variety gene pools were in significant LD (P<0.001) on a whole genome level, but only 0.02% and 0.03% were in LD based on r²>0.2 within the two gene pools (Table 6). The extremely low values of LD in the two clusters can be seen as evidence that many of the recombination events in past breeding history have been maintained and fixed in homozygous self-fertilizing bread wheat, as well as a reflection of the higher genetic diversity that is maintained in the mini core collection in Chinese wheat genetic resources. Understanding the patterns of LD across the genome will facilitate prediction of marker densities required for efficient association of genotypes with traits in Chinese wheat genetic resources at both the genome and chromosome levels.

Materials and Methods

Plant Materials

A total of 250 wheat accessions were used in the present study. These included 93 modern varieties and 157 landraces, among which 245 (98%) were from the Chinese wheat mini core collection constructed in our group [6], [31]. This collection, representing just 1% of the national collection, has more than 70% of its genetic diversity.

Microsatellite Analysis

Genomic DNA of all materials was extracted using lyophilized pooled young leaves of ten seedlings following Sharp et al., [45]. A total of 512 pairs of SSR primers with good genome coverage were selected to genotype the collection. The primers comprised 212 GWM [46], 114 BARC [47], 89 WMC [48], 66 CFD, 18 CFA plus 2 GPW [INRA, http://wheat.pw.usda.gov/ggpages/SSRclub/], 10 GDM [49] and 1 PSP primer set. The genetic distance (cM) of each locus on a consensus map was obtained from the Komugi wheat genetic resources database [http://www.shigen.nig.ac.jp/wheat/komugi/top/top.jsp] (Table S2, Figure S1). In total, these SSR loci covered 2,631.3 cM with a mean genetic distance of 5.1 cM between adjacent loci (Table S3). More information concerning these wheat microsatellite markers is available in the GrainGenes 2.0 database [http://wheat.usda.gov/GG2/index.shtml]. Fluorescence-labeled primers that were relatively evenly distributed on the 21 wheat chromosomes were synthesized at Applied Biosystems Company. An ABI 3730 Analyzer (Applied Biosystems) was used to capture amplification products by a fluorescence detection system for microsatellite markers. More detailed experimental procedures are given in Hao et al., [31]. Fragment sizes were evaluated using GeneMapper v3.7 software (Applied Biosystems), and the molecular data-points for all SSR markers are listed in Table S4.

Data Analysis

Population structure analysis for the 250 Chinese wheat accessions was performed using the molecular datasets of 512 whole-genome SSR markers with STRUCTURE v2.2 software [28]. We adopted the “admixture model” with a burn-in period of 50,000 iterations followed by 100,000 Markov Chain Monte Carlo (MCMC) replications. Five independent runs of STRUCTURE were performed for each number of clusters (K) from 1 to 11, giving 55 STRUCTURE outputs. We then estimated the number of subpopulations and the best output on the basis of the Evanno criterion [50].
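
As an illustration of the Evanno criterion, the sketch below computes delta K from replicate log-likelihoods, taking delta K as the absolute second difference of the mean L(K) divided by the standard deviation of L(K) across replicates (one common way of evaluating the statistic). The function name `delta_k` and the input layout are assumptions, the log-likelihood values are invented, and delta K is undefined for the smallest and largest K tested.

```python
from statistics import mean, stdev


def delta_k(log_likelihoods):
    """Evanno-style delta K from replicate STRUCTURE log-likelihoods.

    log_likelihoods: dict mapping K -> list of L(K), one value per replicate run.
    K values are assumed to be consecutive integers with at least two runs each.
    """
    ks = sorted(log_likelihoods)
    means = {k: mean(log_likelihoods[k]) for k in ks}
    result = {}
    for k in ks[1:-1]:  # needs both K-1 and K+1
        second_difference = abs(means[k + 1] - 2.0 * means[k] + means[k - 1])
        spread = stdev(log_likelihoods[k])
        result[k] = second_difference / spread if spread > 0 else float("inf")
    return result


# Invented log-likelihoods (two replicates per K), for illustration only.
runs = {
    1: [-100.0, -102.0],
    2: [-50.0, -52.0],
    3: [-45.0, -47.0],
    4: [-44.0, -46.0],
}
values = delta_k(runs)
best_k = max(values, key=values.get)
print(values, best_k)
```

With these invented inputs the likelihood gain flattens after K = 2, so delta K peaks there, mirroring the shape of the curve described for Figure 1.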

Genetic dissimilarities between accessions were calculated using the simple matching coefficient in DARwin software [51]. Cluster analysis and dendrogram construction were performed on the dissimilarity matrices with the unweighted pair-group method using arithmetic averages (UPGMA). Principal coordinate analysis, carried out in NTSYS-pc version 2.1 software [43], was also used to reveal the relationships among the 250 accessions based on the same dissimilarity matrices.
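As a rough illustration of a simple matching dissimilarity, the sketch below assumes one allele call per locus per accession (as for homozygous lines) and skips loci with missing data. DARwin's exact treatment of ploidy and missing values may differ. The function name and fragment sizes are hypothetical.

```python
def simple_matching_dissimilarity(calls_a, calls_b):
    """Proportion of compared loci at which two accessions differ.

    calls_a and calls_b are equal-length sequences of allele calls
    (for example fragment sizes in bp); None marks missing data.
    """
    if len(calls_a) != len(calls_b):
        raise ValueError("accessions must be scored at the same loci")
    # Keep only loci scored in both accessions.
    compared = [(x, y) for x, y in zip(calls_a, calls_b)
                if x is not None and y is not None]
    if not compared:
        raise ValueError("no locus scored in both accessions")
    matches = sum(1 for x, y in compared if x == y)
    return 1.0 - matches / len(compared)

# Hypothetical fragment sizes at four loci (illustration only).
acc_1 = [150, 152, None, 200]
acc_2 = [150, 154, 180, 200]
d_12 = simple_matching_dissimilarity(acc_1, acc_2)
```

Applying this function to every pair of accessions would give the kind of dissimilarity matrix that UPGMA clustering and principal coordinate analysis take as input.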

Basic statistics of genetic diversity, including the total number of alleles and the polymorphism information content (PIC) at each SSR locus, were calculated with PowerMarker v3.25 [53]. PIC was computed according to the formula $\mathrm{PIC} = 1 - \sum_i p_i^2$ [52], where $p_i$ is the frequency of the $i$th allele. Genetic differentiation between landraces and modern varieties on a genome basis was assessed with POPGENE software [54] using gene flow (Nm), genetic distance (GD), genetic identity (GI), Shannon's information index (I) and the coefficient of gene differentiation (Fst). The genetic variation within and among populations of wheat accessions for different genomes was evaluated using analysis of molecular variance (AMOVA) implemented in Arlequin v3.11 software [55]. Because the two sub-groups differed in sample size, an allele rarefaction method was used to standardize the allelic richness of samples [56].
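The PIC formula and the idea behind allelic rarefaction can be sketched as follows. The rarefaction function assumes the standard expectation for the number of distinct alleles in a random subsample of n gene copies, r(n) = Σ_i [1 − C(N − N_i, n) / C(N, n)], which may differ in detail from the software used here. Function names and allele data are hypothetical.

```python
from collections import Counter
from math import comb

def pic(alleles):
    """PIC = 1 - sum(p_i^2) over the allele frequencies at one locus."""
    counts = Counter(alleles)
    total = sum(counts.values())
    return 1.0 - sum((c / total) ** 2 for c in counts.values())

def rarefied_richness(alleles, n):
    """Expected number of distinct alleles in a random subsample of size n.

    For each allele with count N_i among N copies, the probability that
    it is absent from the subsample is C(N - N_i, n) / C(N, n).
    """
    counts = Counter(alleles)
    total = sum(counts.values())
    if not 0 < n <= total:
        raise ValueError("n must be between 1 and the sample size")
    return sum(1.0 - comb(total - c, n) / comb(total, n)
               for c in counts.values())

# Hypothetical allele calls at one locus in ten accessions.
locus = ["a"] * 5 + ["b"] * 3 + ["c"] * 2
locus_pic = pic(locus)                     # 1 - (0.25 + 0.09 + 0.04)
richness_at_4 = rarefied_richness(locus, 4)
```

Rarefying both sub-groups to the same subsample size n makes their allelic richness comparable even when one sub-group contains more accessions than the other.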

Linkage disequilibrium (LD) between markers, including the pairwise squared allele-frequency correlation ($r^2$) and the significance of each pair of loci [57], was calculated with the dedicated procedure in TASSEL software [58]. Before LD estimation, alleles with frequencies of less than 5% in the whole collection were filtered out of the SSR datasets, and LD was computed using 100,000 permutations.
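The two steps described above, masking rare alleles and computing $r^2$, can be sketched for the simplest case of one chosen allele at each of two loci. This is only an illustration under the assumption of one allele call per locus per accession. TASSEL's treatment of multi-allelic SSR loci and its permutation test are more involved and are not reproduced here. All names and data are hypothetical.

```python
from collections import Counter

def mask_rare_alleles(calls, min_freq=0.05):
    """Replace alleles with frequency below min_freq by None (missing)."""
    observed = [c for c in calls if c is not None]
    counts = Counter(observed)
    n = len(observed)
    return [c if c is not None and counts[c] / n >= min_freq else None
            for c in calls]

def r_squared(locus_a, locus_b, allele_a, allele_b):
    """r^2 = D^2 / (pA (1 - pA) pB (1 - pB)) for one allele pair.

    Accessions with a missing call at either locus are skipped.
    """
    pairs = [(x, y) for x, y in zip(locus_a, locus_b)
             if x is not None and y is not None]
    n = len(pairs)
    if n == 0:
        raise ValueError("no accession scored at both loci")
    p_a = sum(x == allele_a for x, _ in pairs) / n
    p_b = sum(y == allele_b for _, y in pairs) / n
    p_ab = sum(x == allele_a and y == allele_b for x, y in pairs) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("chosen allele is fixed or absent at a locus")
    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    return d * d / denom

# Hypothetical allele calls at two loci in four accessions.
linked = r_squared([1, 1, 2, 2], [3, 3, 4, 4], 1, 3)
unlinked = r_squared([1, 1, 2, 2], [3, 4, 3, 4], 1, 3)
masked = mask_rare_alleles([1] * 20 + [2])
```

In this toy example the first pair of loci is in complete LD and the second pair shows none. The lone copy of allele 2 in `masked` falls below the 5% threshold and is treated as missing.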

Supporting Information

Figure S1

Consensus genetic maps showing positions of the 512 SSR loci studied [source: http://www.shigen.nig.ac.jp/wheat/komugi/top/top.jsp]. Numbers on the left are genetic distances in centiMorgans (cM).


Figure S2

Comparative PIC distributions of SSR loci in the landrace and modern variety sub-groups for all 21 wheat chromosomes. Blue curves show PIC trends in the modern variety sub-group, and red curves show trends in the landrace sub-group. The blue broken line indicates the average PIC value of all SSR loci for the modern varieties, and the red broken line indicates that for the landraces. The mean PIC values are given below each broken line. Genetic positions (cM) of SSR loci are from the Komugi wheat genetic resources database [http://www.shigen.nig.ac.jp/wheat/komugi/top/top.jsp].


Table S1

Detailed information of 250 accessions used in the study.


Table S2

List of 512 SSR loci used in the study.


Table S3

Chromosomal distribution of SSR loci used in the study.


Table S4

Molecular database of 512 SSR loci in 250 Chinese wheat genetic resources.



The authors are grateful to Ms. HN Zhang, YH Tian, J Lin and YJ Wang for excellent genotyping of the mini core collection, and to Dr. M Ren for help with data analysis. We also gratefully acknowledge help from Prof. Robert A McIntosh, University of Sydney, with English editing.


Competing Interests: The authors have declared that no competing interests exist.

Funding: This work was supported by the Chinese Ministry of Science and Technology (2010CB125900, 2004CB117202), Modern Agricultural Technical System, and National Natural Science Foundation of China (30900898). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.


1. He ZH, Rajaram S, Xin ZY, Huang GZ, editors. Mexico, D.F. : CIMMYT; 2001. A history of wheat breeding in China.
2. Zhuang QS. Beijing: China Agricultural Press; 2003. Chinese Wheat improvement and pedigree analysis (in Chinese).
3. Dong YS, Cao YS, Zhang XY, Liu SC, Wang LF, et al. Establishment of candidate core collections in Chinese common wheat germplasm. J Plant Genet Resour (in Chinese with English abstract) 2003;4:1–8.
4. Schoen DJ, Brown AH. Conservation of allelic richness in wild crop relatives is aided by assessment of genetic markers. Proc Natl Acad Sci USA. 1993;90:10623–10627. [PMC free article] [PubMed]
5. Tanksley SD, McCouch SR. Seed banks and molecular maps: Unlocking genetic potential from the wild. Science. 1997;277:1063–1066. [PubMed]
6. Hao CY, Dong YC, Wang LF, You GX, Zhang HN, et al. Genetic diversity and construction of core collection in Chinese wheat genetic resources. Chinese Science Bulletin. 2008;53:1518–1526.
7. Wang J, Sun JZ, Liu DC, Yang WL, Wang DW, et al. Analysis of Pina and Pinb alleles in the micro-core collections of Chinese wheat germplasm by ecotilling and identification of a novel Pinb allele. Journal of Cereal Science. 2008;48:836–842.
8. Guo ZA, Song YX, Zhou RH, Ren ZL, Jia JZ. Discovery, evaluation and distribution of haplotypes of the wheat Ppd-D1 gene. New Phytol. 2010;185:841–851. [PubMed]
9. Hedrick PW. Gametic disequilibrium measures: proceed with caution. Genetics. 1987;117:331–341. [PMC free article] [PubMed]
10. Flint-Garcia SA, Thornsberry JM, Buckler ES. Structure of linkage disequilibrium in plants. Annu Rev Plant Biol. 2003;54:357–374. [PubMed]
11. Rafalski A, Morgante M. Corn and humans: recombination and linkage disequilibrium in two genomes of similar size. Trends Genet. 2004;20:103–111. [PubMed]
12. Zhang XY, Tong YP, You GX, Hao CY, Ge HM, et al. Hitchhiking effect mapping: A new approach for discovering agronomic important genes. Agricultural Sciences in China. 2007;6:255–264.
13. Remington DL, Thornsberry JM, Matsuoka Y, Wilson LM, Whitt SR, et al. Structure of linkage disequilibrium and phenotypic associations in the maize genome. Proc Natl Acad Sci USA. 2001;98:11479–11484. [PMC free article] [PubMed]
14. Wang RH, Yu YT, Zhao JR, Shi YS, Song YC, et al. Population structure and linkage disequilibrium of a mini core set of maize inbred lines in China. Theor Appl Genet. 2008;117:1141–1153. [PubMed]
15. Yan JB, Shah T, Warburton ML, Buckler ES, McMullen MD, et al. Genetic characterization and linkage disequilibrium estimation of a global maize collection using SNP markers. PLoS ONE. 2009;4:e8451. [PMC free article] [PubMed]
16. Garris AJ, McCouch SR, Kresovich S. Population structure and its effect on haplotype diversity and linkage disequilibrium surrounding the xa5 locus of rice (Oryza sativa L.). Genetics. 2003;165:759–769. [PMC free article] [PubMed]
17. Mather KA, Caicedo AL, Polato NR, Olsen KM, McCouch S, et al. The extent of linkage disequilibrium in rice (Oryza sativa L.). Genetics. 2007;177:2223–2232. [PMC free article] [PubMed]
18. Kraakman ATW, Niks RE, Van den Berg PMMM, Stam P, Van Eeuwijk FA. Linkage disequilibrium mapping of yield and yield stability in modern spring barley cultivars. Genetics. 2004;168:435–446. [PMC free article] [PubMed]
19. Malysheva-Otto LV, Ganal MW, Röder MS. Analysis of molecular diversity, population structure and linkage disequilibrium in a worldwide survey of cultivated barley germplasm (Hordeum vulgare L.). BMC Genet. 2006;7:6. [PMC free article] [PubMed]
20. Hamblin MT, Mitchell SE, White GM, Gallego J, Kukatla R, et al. Comparative population genetics of the Panicoid grasses: Sequence polymorphism, linkage disequilibrium and selection in a diverse sample of Sorghum bicolor. Genetics. 2004;167:471–483. [PMC free article] [PubMed]
21. Maccaferri M, Sanguineti MC, Noli E, Tuberosa R. Population structure and long-range linkage disequilibrium in a durum wheat elite collection. Mol Breed. 2005;15:271–289.
22. Zhu YL, Song QJ, Hyten DL, Van Tassell CP, Matukumalli LK, et al. Single-nucleotide polymorphisms in soybean. Genetics. 2003;163:1123–1134. [PMC free article] [PubMed]
23. Li YH, Guan RX, Liu ZX, Ma YS, Wang LX, et al. Genetic structure and diversity of cultivated soybean (Glycine max (L.) Merr.) landraces in China. Theor Appl Genet. 2008;117:857–871. [PubMed]
24. Chao S, Zhang WJ, Dubcovsky J, Sorrells M. Evaluation of genetic diversity and genome-wide linkage disequilibrium among U.S. wheat (Triticum aestivum L.) germplasm representing different market classes. Crop Sci. 2007;47:1018–1030.
25. Somers DJ, Banks T, DePauw R, Fox S, Clarke J, et al. Genome-wide linkage disequilibrium analysis in bread wheat and durum wheat. Genome. 2007;50:557–567. [PubMed]
26. Breseghello F, Sorrells ME. Association mapping of kernel size and milling quality in wheat (Triticum aestivum L.) cultivars. Genetics. 2006;172:1165–1177. [PMC free article] [PubMed]
27. Horvath A, Didier A, Koenig J, Exbrayat F, Charmet G, et al. Analysis of diversity and linkage disequilibrium along chromosome 3B of bread wheat (Triticum aestivum L.). Theor Appl Genet. 2009;119:1523–1537. [PubMed]
28. Pritchard JK, Stephens M, Donnelly P. Inference of population structure from multilocus genotype data. Genetics. 2000;155:945–959. [PMC free article] [PubMed]
29. Zhang XY, Li CW, Wang LF, Wang HM, You GX, et al. An estimation of the minimum number of SSR alleles needed to reveal genetic relationships in wheat varieties. I. Information from large-scale planted varieties and corner-stone breeding parents in Chinese wheat improvement and production. Theor Appl Genet. 2002;106:112–117. [PubMed]
30. You GX, Zhang XY, Wang LF. An estimation of the minimum number of SSR loci needed to reveal genetic relationships in wheat varieties: Information from 96 random samples with maximized genetic diversity. Mol Breed. 2004;14:397–406.
31. Hao CY, Zhang XY, Wang LF, Dong YS, Shang XW, et al. Genetic diversity and core collection evaluations in common wheat germplasm from the Northwestern Spring Wheat Region in China. Mol Breed. 2006;17:69–77.
32. Rosenberg NA, Li LM, Ward R, Pritchard JK. Informativeness of genetic markers for inference of ancestry. Am J Hum Genet. 2003;73:1402–1422. [PMC free article] [PubMed]
33. Payseur BA, Jing P. A genome-wide comparison of population structure at STRPs and nearby SNPs in humans. Mol Biol Evol. 2009;26:1369–1377. [PMC free article] [PubMed]
34. Li YH, Li W, Zhang C, Yang L, Chang RZ, et al. Genetic diversity in domesticated soybean (Glycine max) and its wild progenitor (Glycine soja) for simple sequence repeat and single-nucleotide polymorphism loci. New Phytol. 2010;188(1):242–253. [PubMed]
35. Röder MS, Wendehake K, Korzun V, Bredemeijer G, Laborie D, et al. Construction and analysis of a microsatellite-based database of European wheat varieties. Theor Appl Genet. 2002;106:67–73. [PubMed]
36. Dreisigacker S, Zhang P, Warburton ML, Van Ginkel M, Hoisington D, et al. SSR and pedigree analyses of genetic diversity among CIMMYT wheat lines targeted to different megaenvironments. Crop Sci. 2004;44:381–388.
37. Huang XQ, Börner A, Röder MS, Ganal MW. Assessing genetic diversity of wheat (Triticum aestivum L.) germplasm using microsatellite markers. Theor Appl Genet. 2002;105:699–707. [PubMed]
38. Roussel V, Koenig J, Beckert M, Balfourier F. Molecular diversity in French bread wheat accessions related to temporal trends and breeding programmes. Theor Appl Genet. 2004;108:920–930. [PubMed]
39. Balfourier F, Roussel V, Strelchenko P, Exbrayat-Vinson F, Sourdille P, et al. A worldwide bread wheat core collection arrayed in a 384-well plate. Theor Appl Genet. 2007;114:1265–1275. [PubMed]
40. Dickson SP, Wang K, Krantz L, Hakonarson H, Goldstein DB. Rare variants create synthetic genome-wide associations. PLoS Biol. 2010;8:e1000294. [PMC free article] [PubMed]
41. Chapman MA, Pashley CH, Wenzler J, Hvala J, Tang SX, et al. A genomic scan for selection reveals candidates for genes involved in the evolution of cultivated sunflower (Helianthus annuus). Plant Cell. 2008;20:2931–2945. [PMC free article] [PubMed]
42. Knowler WC, Williams RC, Pettitt DJ, Steinberg AG. Gm3-5, 13, 14 and type 2 diabetes mellitus: an association in American Indians with genetic admixture. Am J Hum Genet. 1988;43:520–526. [PMC free article] [PubMed]
43. Rohlf FJ. NTSYS-pc: numerical taxonomy and multivariate analysis system, version 2.1. 2000 Exeter Software, Setauket, N.Y.
44. Cavalli-Sforza LL. Population structure and human evolution. Proc R Soc Lond B Biol Sci. 1966;164:362–379. [PubMed]
45. Sharp PJ, Chao S, Desai S, Gale MD. The isolation, characterization and application in Triticeae of a set of wheat RFLP probes identifying each homoeologous chromosome arm. Theor Appl Genet. 1989;78:342–348. [PubMed]
46. Röder MS, Korzun V, Wendehake K, Plaschke J, Tixier MH, et al. A microsatellite map of wheat. Genetics. 1998;149:2007–2023. [PMC free article] [PubMed]
47. Gupta PK, Balyan HS, Edwards KJ, Isaac P, Korzun V, et al. Genetic mapping of 66 new microsatellite (SSR) loci in bread wheat. Theor Appl Genet. 2002;105:413–422. [PubMed]
48. Somers DJ, Isaac P, Edwards K. A high-density microsatellite consensus map for bread wheat (Triticum aestivum L.). Theor Appl Genet. 2004;109:1105–1114. [PubMed]
49. Pestsova E, Ganal MW, Röder MS. Isolation and mapping of microsatellite markers specific for the D genome of bread wheat. Genome. 2000;43:689–697. [PubMed]
50. Evanno G, Regnaut S, Goudet J. Detecting the number of clusters of individuals using the software STRUCTURE: a simulation study. Molecular Ecology. 2005;14:2611–2620. [PubMed]
51. Perrier X, Flori A, Bonnot F. Data analysis methods. In: Hamon P, Seguin M, Perrier X, Glaszmann JC, editors. Genetic diversity of cultivated tropical plants. Montpellier, France: Enfield, Science Publishers; 2003. pp. 43–76.
52. Nei M. Analysis of gene diversity in subdivided populations. Proc Natl Acad Sci USA. 1973;70:3321–3323. [PMC free article] [PubMed]
53. Liu K, Muse SV. PowerMarker: integrated analysis environment for genetic marker data. Bioinformatics. 2005;21:2128–2129. [PubMed]
54. Yeh FY, Boyle R, Ye T, Mao Z. Alberta, Canada: Molecular Biology and Biotechnology Centre, University of Alberta; 1997. POPGENE, the user-friendly shareware for population genetic analysis, version 1.31.
55. Excoffier L, Laval G, Schneider S. Arlequin ver. 3.0: An integrated software package for population genetics data analysis. Evolutionary Bioinformatics Online. 2005;1:47–50. [PMC free article] [PubMed]
56. Petit RJ, El Mousadik A, Pons O. Identifying populations for conservation on the basis of genetic markers. Conserv Biol. 1998;12:844–855.
57. Gaut BS, Long AD. The lowdown on linkage disequilibrium. Plant Cell. 2003;15:1502–1506. [PMC free article] [PubMed]
58. Zhang ZW, Bradbury PJ, Kroon DE, Casstevens TM, Buckler ES. 2006. TASSEL 2.0: a software package for association and diversity analyses in plants and animals (www.maizegenetics.net). In: Plant & Animal Genomes XIV Conference, Poster P956/CP012, San Diego, USA.

Articles from PLoS ONE are provided here courtesy of Public Library of Science