Genetics. Dec 2009; 183(4): 1525–1534.
PMCID: PMC2787436

Comparison of Mating Designs for Establishing Nested Association Mapping Populations in Maize and Arabidopsis thaliana

Abstract

The nested association mapping (NAM) strategy promises to combine the advantages of linkage mapping and association mapping. The objectives of my research were to (i) investigate by computer simulations the power and type I error rate for detecting quantitative trait loci (QTL) with additive effects using recombinant inbred line (RIL) populations of maize derived from various mating designs, (ii) compare these estimates to those obtained for RIL populations of Arabidopsis thaliana, (iii) examine for both species the optimum number of inbreds used as parents of the NAM populations, and (iv) provide on the basis of the results of these two model species a general guideline for the design of NAM populations in other plant species. The computer simulations were based on empirical data of a set of 26 diverse maize inbred lines and a set of 20 A. thaliana inbreds both representing a large part of the genetic diversity of the corresponding species. I observed considerable differences in the power for QTL detection between NAM populations of the same size but created on the basis of different crossing schemes. This finding illustrated the potential to improve the power for QTL detection without increasing the total resources necessary for a QTL mapping experiment. Furthermore, my results clearly indicated that it is advantageous to create NAM populations from a large number of parental inbreds.

MANY traits that are important for fitness and agricultural value of plants are quantitative traits. Such traits are affected by many genes, the environment, and interactions between genes and the environment (Holland 2007). In plants, quantitative trait locus (QTL) mapping is a key tool for studying the genetic architecture of quantitative traits (Yano 2001). This method enables the estimation of (i) the number of genome regions affecting a trait, (ii) the distribution of gene effects, and (iii) the relative importance of additive and nonadditive gene action.

Until now, most of the plant QTL mapping studies have been based on linkage mapping methods using individual biparental populations. The major limitations of such approaches are their poor resolution in detecting QTL and the fact that, with biparental crosses of inbred lines, only two alleles at any given locus can be studied simultaneously (Flint-Garcia et al. 2005). Association mapping methods, which are successfully applied in human genetics to detect genes coding for human diseases (e.g., Willer et al. 2008), promise to overcome these limitations (Kraakman et al. 2004). However, in comparison with linkage mapping approaches, association mapping approaches have only a low power to detect QTL in genomewide scans (Yu and Buckler 2006).

The nested association mapping (NAM) strategy proposed by Yu et al. (2008) uses recombinant inbred line (RIL) populations derived from several crosses of parental inbreds. Due to diminishing chances of recombination over short genetic distance and a given number of generations, the genomes of these RILs are mosaics of chromosomal segments of their parental genomes. Consequently, within the chromosomal segments, the linkage disequilibrium (LD) information across the parental inbreds is maintained. Thus, if diverse parental inbreds are used, LD decays within the chromosomal segments of the RILs over a short physical distance (Wilson et al. 2004). Therefore, the NAM strategy allows the exploitation of both recent and ancient recombination and, thus, will show a high mapping resolution (Yu et al. 2008). Furthermore, due to the balanced design underlying the proposed mapping strategy as well as the systematic reshuffling of the genomes of the parental inbreds during RIL development, NAM populations are expected to show a high power to detect QTL in genomewide approaches (Buckler et al. 2009).

Exploitation of the advantages of the NAM strategy requires developing, genotyping, and phenotyping of RIL populations from several crosses of diverse parental inbreds. This, however, requires large financial resources (cf. Yu et al. 2008). Therefore, it is mandatory that the available resources are spent in an optimum way.

Stich et al. (2009) examined the optimum allocation of resources for NAM in maize with respect to the number of RILs derived from the reference design as well as the number of environments and replications per environment used for phenotypic evaluation. The power for QTL detection, however, is expected to be influenced not only by these factors but also by the crossing scheme from which RIL populations are derived. To my knowledge, no study has so far compared RIL populations derived from various mating designs regarding the power for detecting QTL with additive effects. Furthermore, no information is available on the optimum number of inbreds used as parents of the NAM populations.

For Arabidopsis thaliana, more advanced genomic tools are available than for most other plant species (e.g., Alonso et al. 2003; Clark et al. 2007). This fact increases the prospects of success of NAM approaches. However, A. thaliana differs from maize with respect to the genome size and the allele frequency, which both have the potential to influence the power for QTL detection. Nevertheless, to my knowledge, no study has so far examined the power of NAM in A. thaliana.

The objectives of my research were to (i) investigate by computer simulations the power and type I error rate for detecting QTL with additive effects using RIL populations of maize derived from various mating designs, (ii) compare these estimates to those obtained for RIL populations of A. thaliana, (iii) examine for both species the optimum number of inbreds used as parents of the NAM populations, and (iv) provide on the basis of the results of these two model species a general guideline for the design of NAM populations in other plant species.

MATERIALS AND METHODS

Simulations:

The computer simulations were based on empirical data of 653 single-nucleotide polymorphism (SNP) markers of 26 diverse maize inbred lines, namely B73, B97, CML52, CML69, CML103, CML228, CML247, CML277, CML322, CML333, Hp301, IL14H, Ki3, Ki11, Ky21, M37W, M162W, Mo18W, MS71, NC350, NC358, Oh7b, Oh43, P39, Tx303, and Tzi8 (Yu et al. 2008). These inbreds were selected on the basis of 100 simple sequence repeat markers from a worldwide sample of 260 maize inbreds to capture the maximum genetic diversity (Liu et al. 2003). Details about SNP discovery, detection, and mapping were described by Yu et al. (2008).

Furthermore, I used for my study empirical data of 653 SNP markers of 20 A. thaliana inbreds, namely Bay-0, Bor-4, Br-0, Bur-0, C24, Col-0, Cvi-0, Est-1, Fei-0, Got-7, Ler-1, Lov-5, Nfa-8, Rrs-7, Rrs-10, Sha, Tamm-2, Ts-1, Tsu-1, and Van-0 (Clark et al. 2007). These inbreds were selected on the basis of polymorphisms in 876 genomewide distributed fragments from a sample of 96 A. thaliana genotypes to capture the maximum genetic diversity (Nordborg et al. 2005). The 653 SNP markers were selected from a set of 648,570 nonredundant SNP markers (MBML2 data set; Clark et al. 2007; Kim et al. 2007; ftp://ftp.arabidopsis.org/Polymorphisms/Perlegen_Array_Resequencing_Data_2007/SNP_predictions/) to uniformly cover the chromosomes (supporting information, File S1). Genetic map positions for these SNPs were lacking. Therefore, a linear model was applied to project the physical map position of the SNPs on the genetic map of Singer et al. (2006).

Mating designs evaluated:

The I = 26 maize inbreds and the I = 20 A. thaliana inbreds were used to examine 10 different mating designs using computer simulations (Figure S1). RILs were derived from the crosses of each design, where each RIL was assumed to be derived from a distinct F2 plant through single-seed descent with selfing to the F6 generation.

For the reference (REF) design in maize, RIL populations were derived from the crosses between B73 and the 25 diverse inbreds (McMullen et al. 2009) (Table 1). In A. thaliana, RIL populations were derived from the crosses between Col-0 and the 19 inbred lines. For the diallel (DIA) design, a RIL population was derived from each of the crosses in the diallel (method 4; Griffing 1956) among all I maize or A. thaliana parental inbreds. For the factorial (FCT) design, the 26 maize inbreds or the 20 A. thaliana inbreds were randomly partitioned into two subsets of equal size and a RIL population was derived from each cross between the two sets of inbreds (Comstock and Robinson 1948).

TABLE 1
Number of crosses NC underlying the segregating populations derived from reference (REF), diallel (DIA), factorial (FCT), single round-robin (SRR), double round-robin (DRR), reduced round-robin (RRR), independent round-robin (IRR), and distance-based ...

For the single round-robin (SRR) design (Verhoeven et al. 2006), RILs were derived from each of the chain crosses, i.e., inbred 1 × inbred 2, inbred 2 × inbred 3, … , inbred I × inbred 1. For the double round-robin (DRR) design, a RIL population was derived from each of the double-chain crosses, i.e., inbred 1 × inbred 2, inbred 1 × inbred 3, inbred 2 × inbred 3, inbred 2 × inbred 4, … , inbred I × inbred 1. For reduced round robin (RRR), RILs were derived from the reduced double-chain crosses, i.e., inbred 1 × inbred 2, inbred 1 × inbred 3, inbred 2 × inbred 3, inbred 3 × inbred 4, inbred 5 × inbred 6, … , inbred (I − 1) × inbred I. For the independent round-robin (IRR) design, a RIL population was derived from each of the independent chain crosses among the I inbreds, i.e., inbred 1 × inbred 2, inbred 3 × inbred 4, … , inbred (I − 1) × inbred I.
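The crossing schemes described above can be written down as short list constructions. The following Python sketch is only an illustration and is not the code used in the study; all function names are hypothetical. The diallel is taken without selfs or reciprocals, the double round-robin is read as "inbred i × inbred i + 1 and inbred i × inbred i + 2, wrapping around at the end of the chain", and the two factorial subsets are passed in explicitly rather than being sampled at random. The RRR design is omitted because its chain pattern cannot be reconstructed unambiguously from the verbal description.

```python
from itertools import combinations

def ref_design(parents, reference):
    """Reference design: the reference parent crossed with every other parent."""
    return [(reference, p) for p in parents if p != reference]

def diallel(parents):
    """Diallel without selfs or reciprocals: every unordered pair exactly once."""
    return list(combinations(parents, 2))

def factorial(set_a, set_b):
    """Factorial: every parent of subset A crossed with every parent of subset B."""
    return [(a, b) for a in set_a for b in set_b]

def single_round_robin(parents):
    """Chain crosses 1x2, 2x3, ..., Ix1."""
    n = len(parents)
    return [(parents[i], parents[(i + 1) % n]) for i in range(n)]

def double_round_robin(parents):
    """Double chain (assumed reading): each inbred i crossed with i+1 and i+2."""
    n = len(parents)
    crosses = []
    for i in range(n):
        crosses.append((parents[i], parents[(i + 1) % n]))
        crosses.append((parents[i], parents[(i + 2) % n]))
    return crosses

def independent_round_robin(parents):
    """Independent chain crosses 1x2, 3x4, ..., (I-1)xI."""
    return [(parents[i], parents[i + 1]) for i in range(0, len(parents) - 1, 2)]
```

Under these readings, I = 26 parents give 25 (REF), 325 (DIA), 169 (FCT with two subsets of 13), 26 (SRR), 52 (DRR), and 13 (IRR) crosses.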

The data sets for the distance-based designs DBp were established by selecting from all crosses in a diallel among the I inbreds the p% combinations of parental inbreds, which show, on the basis of all marker loci, the maximum genetic dissimilarity (Nei and Li 1979). In the current study, the designs DB15, DB30, and DB60 were examined.
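The selection step for the DBp designs can be sketched as follows. This is a minimal illustration, not the procedure implemented in the study: the function names are hypothetical, the dissimilarity is computed as the proportion of mismatching marker alleles (which is what a Nei and Li type coefficient is assumed to reduce to for homozygous, biallelic inbreds), and the number of retained crosses is simply rounded, because the text does not state how fractional counts were handled.

```python
from itertools import combinations

def dissimilarity(g1, g2):
    """Proportion of marker loci at which two inbreds carry different alleles.

    Assumption: stands in for the Nei and Li (1979) coefficient for
    homozygous, biallelic marker data.
    """
    mismatches = sum(a != b for a, b in zip(g1, g2))
    return mismatches / len(g1)

def distance_based_design(genotypes, p):
    """Keep the p% most dissimilar parent pairs of the diallel.

    genotypes: dict mapping inbred name -> sequence of marker alleles.
    p: percentage of diallel crosses to retain (e.g. 15, 30, 60).
    """
    pairs = list(combinations(sorted(genotypes), 2))
    # Sort from most to least dissimilar; ties keep their original order.
    pairs.sort(
        key=lambda pair: dissimilarity(genotypes[pair[0]], genotypes[pair[1]]),
        reverse=True,
    )
    n_keep = max(1, round(len(pairs) * p / 100))  # rounding rule is an assumption
    return pairs[:n_keep]

# Toy example with four hypothetical inbreds and four markers.
toy = {"A": "0000", "B": "1111", "C": "0011", "D": "0001"}
selected = distance_based_design(toy, 30)
```

With this toy data, the two retained crosses are A × B and B × D, and inbred C is not used as a parent at all, which illustrates how a distance-based selection can leave some parental alleles unsampled.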

In addition to the above-described simulations with I = 26 and 20 parental inbreds for maize and A. thaliana, respectively, I examined exemplarily for the REF and DIA designs scenarios with the same total number of RILs N derived from crosses between a reduced number of parents I: For maize, I set I = 23, 20, … , 5 and for A. thaliana I = 17, 14, … , 5. In the simulations of the REF design, the same reference parent (B73; Col-0) was chosen as initially described and the remaining I − 1 parents were randomly selected from the entire set of parental inbreds. In contrast, for the DIA design, I parents were randomly selected from the entire set of parents.

For each of the above-described mating designs, which differ with respect to the number of crosses NC (Table 1), I assumed a total number of RILs N = 1250, 2500, or 5000. The number of RILs per cross NP was calculated as follows: In scenarios with the number of remaining RILs r = N mod NC = 0, NP = N/NC. In contrast, in scenarios with r ≠ 0, I chose for r populations NP = ⌊N/NC⌋ + 1, whereas for the remaining NC − r populations NP = ⌊N/NC⌋, where ⌊N/NC⌋ denotes the integer part of N/NC.
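The allocation rule amounts to an integer division with the remainder spread over r populations. A minimal Python sketch (the function name is hypothetical, and which r populations receive the extra RIL is an assumption, since the text does not specify it):

```python
def rils_per_cross(n_total, n_crosses):
    """Distribute n_total RILs over n_crosses populations as evenly as possible.

    The first r = n_total mod n_crosses populations receive one extra RIL
    (assumption: the text does not say which populations are chosen).
    """
    base, r = divmod(n_total, n_crosses)
    return [base + 1] * r + [base] * (n_crosses - r)

# Example: N = 1250 RILs over the 325 crosses of a diallel among 26 parents.
sizes = rils_per_cross(1250, 325)
```

Here 1250 = 325 × 3 + 275, so 275 populations consist of four RILs and the remaining 50 populations of three RILs.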

Definition of genotypic and phenotypic values:

A total of 100 simulation runs were performed for each of the examined mating designs. For each run, three subsets of SNPs (l = 25, 50, 100) were sampled at random without replacement from the linkage map and were defined as QTL. The SNP markers of my study are biallelic and, thus, the 25 diverse maize inbreds or the 19 A. thaliana inbreds used as parents carry either the same allele as the reference parent (B73; Col-0) or the nonreference parent allele. At each QTL, one allele was assigned the genotypic effect zero whereas the genotypic effect of the other allele was drawn randomly without replacement from the geometric series l(1 − a)[1, a, a^2, a^3, … , a^(l−1)], with a = 0.90 (25 QTL), a = 0.96 (50 QTL), or a = 0.99 (100 QTL) (Lande and Thompson 1990). Genotypic values of the inbreds were determined by summing up the effects of the individual alleles.
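The assignment of QTL effects can be illustrated with a few lines of Python. This is a sketch rather than the PLABSOFT implementation; the function name and the optional random generator argument are hypothetical. Drawing all l effects without replacement from a series of length l is treated here as a random permutation of that series.

```python
import random

def qtl_effects(l, a, rng=None):
    """Return the l effects l(1 - a) * a**k (k = 0..l-1) in random order."""
    rng = rng or random.Random()
    series = [l * (1 - a) * a**k for k in range(l)]
    rng.shuffle(series)  # sampling all elements without replacement
    return series

effects = qtl_effects(25, 0.90, random.Random(1))
```

For l = 25 and a = 0.90, the largest effect is 25 × 0.10 = 2.5, and the effects sum to l(1 − a^l), so a few QTL carry most of the genotypic variation while many have small effects.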

From the genotypic values of the RILs of each cross, the genotypic variance within the cross $\sigma_G^2$ was calculated. For the progenies of each cross, the phenotypic values were generated by adding a realization from a normally distributed variable $N\left(0, \frac{1 - h^2}{h^2}\sigma_G^2\right)$ to the genotypic values, where $h^2$ denotes the heritability on an entry-mean basis. On the basis of previous empirical studies, I examined $h^2$ values of 0.5 and 0.8 (Flint-Garcia et al. 2005). All simulations were performed with software PLABSOFT (Maurer et al. 2008), which is implemented as an extension of the statistical software R (R Development Core Team 2004).
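The generation of phenotypic values can be sketched in Python as follows. The sketch assumes the standard definition of heritability, h² = σ²G / (σ²G + σ²E), so that the error variance is σ²G(1 − h²)/h²; the function name is hypothetical, and the use of the population (rather than sample) variance within the cross is also an assumption.

```python
import random
import statistics

def simulate_phenotypes(genotypic_values, h2, rng=None):
    """Add normally distributed noise so that the cross has heritability h2.

    Assumes h2 = var_G / (var_G + var_E), i.e. var_E = var_G * (1 - h2) / h2.
    """
    rng = rng or random.Random()
    var_g = statistics.pvariance(genotypic_values)  # genotypic variance within the cross
    sd_e = (var_g * (1 - h2) / h2) ** 0.5
    return [g + rng.gauss(0.0, sd_e) for g in genotypic_values]

# Example: one hypothetical cross with 20,000 simulated genotypic values.
rng = random.Random(42)
g_values = [rng.gauss(0.0, 1.0) for _ in range(20000)]
p_values = simulate_phenotypes(g_values, 0.5, rng)
realized_h2 = statistics.pvariance(g_values) / statistics.pvariance(p_values)
```

In a large cross, the ratio of genotypic to phenotypic variance computed in this way should be close to the chosen value of h².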

Statistical analyses:

The comparison of statistical analyses concerning the power 1 − β* requires an equal empirical type I error rate α*. To meet this requirement, I applied the following two-step procedure for QTL detection, which corresponds to that described by Stich et al. (2009). In a first step, stepwise multiple linear regression implemented in PLABQTL (Utz and Melchinger 1996) was used to select a set of cofactors based on the Schwarz (1978) Bayesian criterion, using the model

$$y = \mathbf{1}\mu + \sum_{i} b_i x_i + e$$

where y is the vector of the phenotypic values of all RILs, μ is the intercept, bi is the regression coefficient of the ith marker locus, xi is an incidence vector of the genotypes of the RILs at the ith marker, and e is the vector of residual errors. I assumed that all RILs are genotyped with such a high number of markers that each QTL has a marker that is in complete LD with the QTL. Therefore, all SNPs, inclusive of those treated as QTL, were included in the QTL detection procedure.

In the second step, I calculated a P-value for the association of each marker q with the phenotypic value for an F-test with a full model against a reduced model,

$$y = \mathbf{1}\mu + b_q x_q + \sum_{c \neq q} b_c x_c + e$$

where bq (bc) is the regression coefficient of the qth marker locus (or cth cofactor) and xq (xc) is an incidence vector of the genotypes of the RILs at the qth marker (cth cofactor). The c ≠ q in the above formula indicates that from the total set of cofactors only those cofactors are used in the F-test of a specific marker that are not identical to the marker under consideration. This constraint is inevitable to detect also those QTL for which a cofactor was selected in the first step.
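The second step compares a model with the tested marker against the same model without it. A minimal NumPy sketch of such a full-versus-reduced F-test is given below. It is not the PLABQTL routine; the function name is hypothetical, the cofactor matrix is assumed to already exclude the tested marker, and the design matrix is assumed to have full column rank. The P-value would follow from the F distribution with 1 and n − p degrees of freedom, where p is the number of columns of the full model.

```python
import numpy as np

def f_statistic(y, x_q, cofactors):
    """F statistic for adding marker x_q to a model with intercept and cofactors.

    y: phenotypic values, shape (n,)
    x_q: genotype incidence vector of the tested marker, shape (n,)
    cofactors: matrix of cofactor genotypes, shape (n, k); k may be 0 and
               must not contain the tested marker itself.
    """
    n = len(y)
    reduced = np.hstack([np.ones((n, 1)), cofactors])
    full = np.hstack([reduced, x_q.reshape(-1, 1)])

    def rss(design):
        # Residual sum of squares of the least-squares fit.
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        resid = y - design @ coef
        return float(resid @ resid)

    rss_reduced, rss_full = rss(reduced), rss(full)
    df_resid = n - full.shape[1]
    return (rss_reduced - rss_full) / (rss_full / df_resid)

# Toy example without cofactors: two RILs per marker class.
f_value = f_statistic(
    np.array([1.0, 2.0, 3.0, 4.0]),
    np.array([0.0, 0.0, 1.0, 1.0]),
    np.empty((4, 0)),
)
```

In the toy example, the residual sum of squares drops from 5 to 1 when the marker is added, which gives F = (5 − 1) / (1/2) = 8 with 1 and 2 degrees of freedom.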

In addition to the above-described procedure for QTL detection, I used a procedure that accounts for the structure of the simulated RIL populations by including the mean value of the RILs derived from each cross (cf. Yu et al. 2008) in the model of each of the two above-described steps.

For each combination of N, l, and h2 examined for each mating design, the nominal α-level was chosen in such a way that the empirical type I error rate α* was 0.01 (Table S1). Due to the fact that none of the simulated QTL was monomorphic in any of the examined scenarios, the power for QTL detection (1 − β*) was calculated on the basis of this α-level as the average proportion of QTL correctly identified from the total number of QTL l.
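One way to picture this calibration is to treat the nominal α-level as an empirical quantile of the P-values of markers that are not QTL, and the power as the proportion of QTL whose P-value falls below that level. The Python sketch below follows this reading; it is an assumption about the procedure and not the implementation used in the study, and both function names are hypothetical.

```python
def calibrate_alpha(null_p_values, target=0.01):
    """Nominal level at which a fraction `target` of null markers is declared significant."""
    ordered = sorted(null_p_values)
    k = round(target * len(ordered))  # number of false positives to allow
    return ordered[k - 1] if k > 0 else 0.0

def power(qtl_p_values, alpha):
    """Proportion of simulated QTL detected at the given nominal level."""
    return sum(p <= alpha for p in qtl_p_values) / len(qtl_p_values)

# Toy example: 1000 evenly spaced null P-values and four QTL P-values.
null_p = [i / 1000 for i in range(1, 1001)]
alpha = calibrate_alpha(null_p)
detected = power([0.001, 0.005, 0.2, 0.5], alpha)
```

In the toy example, the calibrated level is 0.01 and two of the four QTL are detected. With P-values that are not uniformly distributed under the null, as for the scenarios summarized in Table S1, the calibrated level differs from the target.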

RESULTS

For maize, the average map distance between the 653 SNP markers was 2.4 cM, whereas for A. thaliana the average map distance was 0.6 cM. The pairwise genetic dissimilarity among the 26 maize inbreds ranged from 0.25 to 0.42, whereas for A. thaliana values between 0.16 and 0.31 were observed. For maize, the average frequency of the allele of the reference parent B73 was 0.81 in the RILs of the REF design and ranged from 0.63 to 0.66 in the RILs of all other designs. In contrast, the frequency of the allele of the A. thaliana reference parent Col-0 was 0.89 in the RILs of the REF design and ranged from 0.76 to 0.79 in the RILs of all other designs.
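The relation between these allele frequencies can be checked with a short calculation. It rests on two assumptions that are not stated in the text: each RIL inherits, in expectation, half of its genome from each parent, and in the balanced non-REF designs all I parents contribute equally. The symbol $\bar{p}$, the average frequency of the reference-parent allele among the nonreference parents, is introduced here for illustration only.

```latex
% REF design: half of each RIL genome comes from the reference parent
p_{\mathrm{REF}} = \tfrac{1}{2} + \tfrac{1}{2}\,\bar{p}
\quad\Longrightarrow\quad
\bar{p} = 2\,p_{\mathrm{REF}} - 1

% Maize (I = 26): p_REF = 0.81
\bar{p} = 2(0.81) - 1 = 0.62,
\qquad
p_{\mathrm{balanced}} \approx \frac{25(0.62) + 1}{26} = \frac{16.5}{26} \approx 0.63

% A. thaliana (I = 20): p_REF = 0.89
\bar{p} = 2(0.89) - 1 = 0.78,
\qquad
p_{\mathrm{balanced}} \approx \frac{19(0.78) + 1}{20} = \frac{15.82}{20} \approx 0.79
```

Both approximations fall within the ranges reported for the non-REF designs (0.63 to 0.66 for maize and 0.76 to 0.79 for A. thaliana), so the higher reference-allele frequency of the REF design appears to be a direct consequence of using one common parent in every cross.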

For the 1250 RILs derived from the REF design of maize, a power to detect QTL 1 − β* of 0.603 was observed for the scenario with l = 25 QTL and h2 = 0.5 (Table 2). The duplication or quadruplication of the number of QTL from 25 to 50 or 100 resulted in a decrease of 1 − β* to about three-fourths or one-third of the initial value, respectively. For l = 25 QTL, an increase of h2 from 0.5 to 0.8 resulted in an increase of 1 − β* of about one half, and this increase was even more pronounced for l = 50 and 100 than for l = 25. The duplication or quadruplication of the number of RILs N from 1250 to 2500 or 5000 resulted in a small increase of 1 − β* for l = 25, a medium increase for l = 50, and a large increase for l = 100. Furthermore, the increase of 1 − β* was more pronounced for h2 = 0.5 than for h2 = 0.8. Across all scenarios of the REF design of maize, the nominal α-level ranged from 0.0002 (N = 1250; l = 100; h2 = 0.5) to 0.0133 (N = 5000; l = 25; h2 = 0.8) (Table S1).

TABLE 2
Power to detect QTL (α* = 0.01) and the corresponding standard error for different numbers N of maize recombinant inbred lines derived from different mating designs: reference (REF), diallel (DIA), factorial (FCT), single round-robin ...

The 1 − β* trends observed for RIL populations derived from the non-REF designs upon changes of l, h2, and N were similar to those found for the REF design (Table 2). For N = 1250, the ranking of the various designs with respect to the 1 − β* values across all examined levels of l and h2 was FCT > DIA > DRR > DB60 > SRR > DB30 > RRR > IRR > DB15 > REF. The duplication or quadruplication of N from 1250 to 2500 or 5000 resulted in a shift of FCT to rank four and of DB30 to rank seven. The trends of the nominal α-level observed for RIL populations derived from the non-REF designs upon changes of l, h2, and N were similar to those found for the REF design.

Across all examined scenarios, the 1 − β* values observed for A. thaliana were between one-tenth and one-fifth lower than those observed for maize (Table 3). The 1 − β* trends observed for A. thaliana RIL populations derived from all designs upon changes of l, h2, and N were similar to those found for maize. For A. thaliana, the ranking of the examined designs with respect to the 1 − β* values across all examined levels of l, h2, and N was FCT > DIA > DB60 > DRR > SRR > DB30 > RRR > DB15 > IRR > REF. The trends of the nominal α-level observed for A. thaliana RIL populations were similar to those found for maize.

TABLE 3
Power to detect QTL (α* = 0.01) and the corresponding standard error for different numbers N of Arabidopsis thaliana recombinant inbred lines derived from different mating designs: reference (REF), diallel (DIA), factorial (FCT), ...

Decreasing the number of parental inbreds involved in the development of RIL populations of maize resulted in a decrease of 1 − β* (Figure 1). This decrease of 1 − β* was more pronounced for A. thaliana than for maize. For both species, the decrease of 1 − β* upon the reduction of I was stronger for scenarios with a low number of QTL and high values for h2 than vice versa (Figure 1, A and B). Across all examined levels of l and h2, the decrease of 1 − β* upon the reduction of I was slightly more pronounced for the DIA design than for the REF design, whereas N did not influence this trend.

Figure 1.
Power to detect QTL (α* = 0.01) of maize (A and C) and A. thaliana (B and D) recombinant inbred line (RIL) populations. (A and B) N = 2500 RILs were derived from the reference (REF) design using various numbers of parental ...

Across all examined scenarios, the 1 − β* values observed for maize as well as A. thaliana on the basis of the QTL detection procedure that takes into account the structure of the RIL populations were between one-third and one-fourth lower than those observed for the QTL detection that ignores this structure (data not shown). For the former QTL detection procedure, the 1 − β* trends observed with respect to the examined mating designs as well as l, h2, and N were similar to those found for the latter QTL detection procedure.

DISCUSSION

In contrast to previous joint linkage and LD studies, which focused on mining existing mapping populations in pedigrees or heterogeneous stocks (e.g., Meuwissen et al. 2002), the NAM strategy proposed by Yu et al. (2008) aims to create an integrated mapping population specifically designed for a full genome scan for QTL. One idea of this strategy is that with common-parent-specific (CPS) markers genotyped for the parental inbreds and the RILs, the inheritance of chromosome segments nested within two adjacent CPS markers can be inferred through linkage. Genotyping the founders with additional high-density markers enables the projection of genetic information, capturing LD information, from the parental inbreds to the RILs. This approach is expected to allow high-resolution QTL mapping with a relatively low number of markers in the RILs.

However, for designs other than the REF design, the CPS marker strategy is not straightforward to implement. Therefore, in the current study, I assumed that all RILs are genotyped with such a high number of markers that each QTL has a marker that is in complete LD with the QTL. Due to the fast progress of genome sequencing techniques (Shendure et al. 2004), this is a realistic assumption in the foreseeable future. It will then be possible to exploit both recent and ancient recombination in RIL populations derived from mating designs other than the REF design.

QTL detection procedures for NAM populations:

A NAM population consists of a large number of segregating populations (Table 1). The existence of alleles that are specific for some of the segregating populations can lead to experimentwide LD between the causal gene and some unlinked markers. This, however, has the potential to increase the rate of false positive associations when applying QTL detection procedures that neglect the structure of the NAM population (Yu et al. 2008). Therefore, I used in addition to such a QTL detection procedure a procedure that accounts for the structure of the simulated RIL populations by including the mean value of the RILs derived from each cross as a covariate. However, I observed for none of the examined mating designs a considerable difference with respect to the nominal α-level, which is required to obtain an empirical α-level of 0.01, between the two examined QTL detection procedures. This finding suggested that the above-mentioned issue of experimentwide LD between the causal gene and some unlinked markers might be neglected in the current study. Furthermore, because I observed for the QTL detection procedure neglecting the structure of the NAM population considerably higher 1 − β* estimates than for the QTL detection procedure accounting for it, only the results of the former method are discussed below. Nevertheless, further research on the most appropriate QTL detection procedure for NAM populations is required.

Power to detect QTL under different mating designs in maize:

Across all the examined scenarios of maize, the lowest power 1 − β* was observed for the RILs derived from the REF design (Table 2). This observation is in accordance with results of Stich et al. (2007), who compared RIL populations derived from different designs with respect to their power to detect three-way epistatic interactions. These findings might be attributable to the fact that the average frequency of the common parent allele was closer to 1 for RILs derived from the REF design than for all other designs. Crossing schemes that result in RILs with an average allele frequency strongly deviating from 0.5 have a low power to detect QTL because the probability that some QTL alleles are present in only a very low number of RILs is maximized (Verhoeven et al. 2006).

Despite this disadvantage of the REF design, the project “molecular and functional diversity of the maize genome” applied this crossing scheme to establish a NAM population in maize (Yu et al. 2008). The main advantage of this crossing scheme is that crossing the 25 diverse inbreds to the inbred B73, which is well adapted to U.S. environmental conditions, facilitates the development as well as the phenotyping of RILs within the United States (Yu et al. 2008). This issue, however, might be of lower importance for the choice of the most appropriate crossing scheme to establish NAM populations (i) based on another set of parental maize inbreds as well as (ii) for other plant species with a lower genetic diversity than that of the parental inbreds used in the project “molecular and functional diversity of the maize genome.” Therefore, the 1 − β* estimates observed for the other crossing schemes are discussed below.

My results revealed a lower power 1 − β* for the DB designs than for the DIA, FCT, and DRR crossing schemes (Table 2). This observation is in contrast to results of Stich et al. (2007), who observed for optimally allocated DB designs a higher power to detect three-way epistatic interactions than for the DIA crossing scheme. The development of RIL populations from pairs of parental inbreds, which were selected to maximize the pairwise genetic dissimilarity, increases indeed the average probability that QTL are segregating. Nevertheless, such a selection can also lead to the fixation of some QTL, because in contrast to the other designs not all parental inbreds are used for the establishment of the segregating populations. In scenarios with a low power for QTL detection such as that of Stich et al. (2007), the fixation of some QTL has only marginal effects and, thus, the increased average probability that QTL are segregating of DB designs can be used. However, in scenarios with a power 1 − β* close to 1, like in the present study, it is indispensable that all QTL are polymorphic. Thus, balanced crossing schemes such as DIA, FCT, or DRR might be superior to DB designs in scenarios with a high power to detect QTL and vice versa.

Across all scenarios of maize, my results revealed a higher power 1 − β* for RILs derived from the crossing schemes DIA, FCT, and DRR than for the SRR, RRR, and IRR mating designs (Table 2). This observation cannot be explained by differences in allele frequencies, as the RILs derived from all designs except REF showed similar allele frequencies. Partly, my finding might be explained by the large number of small populations derived from the former designs (Table 1). This explanation is in contrast to results of Verhoeven et al. (2006). The different findings can be explained by the different assumptions underlying the simulations. First, Verhoeven et al. (2006) detected QTL within individual RIL populations whereas in my study QTL were detected across all RIL populations. Second, Verhoeven et al. (2006) assumed a distinct allele for each parental inbred. In these cases, large numbers of small populations show, due to the increased probability that some QTL alleles have only a very small class size, a lower power to detect QTL than do a small number of large populations. However, the assumptions made by Verhoeven et al. (2006) ignore the fact that for real data not all QTL segregate in every population (Xu 1996). In my study, this fact was considered by using SNP data of parental inbreds as a basis of the simulations. Consequently, the mating designs resulting in a large number of small populations have indeed the above-mentioned disadvantage but this is compensated by the large number of individuals within populations segregating for the QTL.

For maize, the results of my study indicated that the DIA and FCT crossing schemes result in the highest power for QTL detection. However, establishing the number of crosses required for these mating designs might be realistic only for an outcrossing species such as maize. In contrast, the number of crosses necessary for the DRR mating design is considerably lower than that required for the DIA and FCT crossing schemes (Table 1). Nevertheless, for all three mating designs similar 1 − β* estimates were observed. These findings suggested that the DRR crossing scheme might be the most appropriate design to establish NAM populations in autogamous or partially autogamous species such as barley, wheat, or rapeseed.

Factors influencing the power for QTL detection and the relative performance of crossing schemes to establish NAM populations:

Theoretical considerations suggest that the power for QTL detection 1 − β* but also the relative performance of different crossing schemes to establish NAM populations might be influenced by (i) the plant species examined, (ii) the genetic architecture of the trait under consideration, (iii) the total population size, (iv) the inbreeding procedure, and (v) the number of parental genotypes used.

Plant species:

Across all crossing schemes, lower 1 − β* estimates were observed for A. thaliana NAM populations than for maize NAM populations of similar size (Tables 2 and 3). This finding might be due to the higher frequency of the reference allele in A. thaliana compared with the same design of maize. Another explanation might be the four times lower average map distance between the SNP markers in A. thaliana compared with that in maize. Thereby, in A. thaliana, the recombination between the markers is reduced, which is expected to increase the type I error rate. When the empirical type I error rate is fixed as described in materials and methods, however, this decreases the power for QTL detection.

My results revealed only slight differences between the rankings of the various crossing schemes with respect to 1 − β* for maize and A. thaliana. This observation suggested that my conclusions regarding the most appropriate crossing scheme might also be valid for other plant species.

Genetic architecture of the trait:

A higher power 1 − β* was observed for traits influenced by a low number of QTL than for traits influenced by a high number of QTL (Tables 2 and 3). Similarly, increasing h2 from 0.5 to 0.8 resulted for all examined designs and all numbers of QTL in a considerably higher power to detect QTL. These observations are in accordance with quantitative genetic theory and previous studies (e.g., Van Ooijen 1992; Beavis 1994; Falconer and Mackay 1996) and can be explained by the fact that in the former case each QTL explains a higher proportion of the phenotypic variance than in the latter.

The ranking of the various crossing schemes differed slightly among the three QTL scenarios as well as between the two heritability scenarios. However, the observed differences followed no clear trend.

Total population size:

Across all examined designs, a higher power for QTL detection was observed for populations with a higher number of entries (Tables 2 and 3). This observation is in accordance with results of Schön et al. (2004) and can be explained by the fact that in this case the allele effects are estimated more precisely.

The ranking of the various crossing schemes differed only slightly among the three levels examined for the total number of RILs. Therefore, I expect that my findings are valid for a broad range of total population sizes.

Inbreeding procedure:

The use of inbred genotypes in QTL mapping experiments has several advantages (Burr et al. 1988; Lander and Botstein 1989). Due to the short generation time, such individuals are created for A. thaliana by repeated self-pollination. Most crop species, however, have considerably longer generation times. Therefore, the creation of fully homozygous genotypes in one step via doubled-haploid (DH) induction (Jensen 1975; Bajaj 1977; Bordes et al. 1997; Wenzel et al. 1977) is an interesting alternative to repeated self-pollination and, thus, was examined in my study (data not shown).

My results revealed a power for QTL detection of DH populations derived from F1 genotypes that is similar to that observed for RIL populations of identical size. However, I observed across all the examined scenarios a considerably higher power for DH populations that were derived from F2 genotypes. This observation might be explained by the additional recombinations that occurred before the induction of DHs (cf. Bernardo 2009). Nevertheless, the use of RILs in a NAM context is justified if the ultimate objective of the experiment is to clone the QTL. In this case, the use of heterogeneous inbred families (Tuinstra et al. 1997) derived from RILs proved to be a powerful tool (cf. Fridman et al. 2000).

Number of parental genotypes:

Across all examined scenarios of maize and Arabidopsis, I observed a higher power 1 − β* for NAM populations that were established using a high number of parental inbreds than for those established using a low number (Figure 1). This finding can be explained by the fact that a higher number of parental inbreds increases the number of polymorphic QTL. Therefore, my findings suggest using a high number of parental inbreds for the creation of NAM populations.

For A. thaliana, a linear increase of 1 − β* was observed with an increase of the parental inbreds from 5 to 20. In contrast, for maize, the increase of the power for QTL detection was high for an increase of the parental inbreds from 5 to 14 but was considerably lower for an increase from 14 to 26. These observations might be due to the higher number of rare alleles in A. thaliana compared with maize.
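The relationship between the number of parental inbreds and the number of polymorphic QTL can be illustrated with a minimal sketch. It assumes a biallelic QTL, fully homozygous parents, and parents sampled independently from a population in which the minor allele has a given frequency. These assumptions, the function name, and the allele frequencies shown are illustrative and were not taken from the empirical data of this study.

```python
def probability_polymorphic(n_parents, minor_allele_frequency):
    """Probability that a biallelic QTL segregates among n homozygous parents.

    The QTL is monomorphic only if all parents carry the same allele, which
    happens with probability p^n + (1 - p)^n under independent sampling.
    """
    if n_parents < 1:
        raise ValueError("n_parents must be at least 1")
    p = minor_allele_frequency
    if not 0.0 <= p <= 1.0:
        raise ValueError("allele frequency must lie in [0, 1]")
    return 1.0 - p ** n_parents - (1.0 - p) ** n_parents


if __name__ == "__main__":
    # Hypothetical allele frequencies: one rare and one common allele.
    for frequency in (0.05, 0.30):
        for n in (5, 14, 20, 26):
            prob = probability_polymorphic(n, frequency)
            print(f"p = {frequency:.2f}  parents = {n:2d}  P(polymorphic) = {prob:.3f}")
```

Under these assumptions, the probability for a common allele is already close to 1 with a moderate number of parents, whereas for a rare allele it keeps rising as further parents are added. This pattern is compatible with the explanation given above for the difference between the two species, although the sketch does not model the actual allele frequency distributions.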

For the REF as well as the DIA crossing scheme, I observed a similar increase of 1 − β* with an increasing number of parental inbreds. This finding suggested that the ranking of the crossing schemes with respect to 1 − β* is not or only marginally influenced by the number of parental inbreds used.

Conclusions:

My finding of considerable differences in 1 − β* estimates between NAM populations of the same size but created on the basis of different crossing schemes illustrated the potential to improve the power for QTL detection without increasing the total resources necessary for a QTL mapping experiment. For maize as well as A. thaliana, I observed the highest power for QTL detection for the DIA and FCT crossing schemes. However, for species with a high genetic diversity, such as maize, it will be difficult to generate high-quality phenotypic values in field trials with RIL populations derived from crosses between diverse material. Furthermore, these designs require the creation of a high number of crosses, which might be difficult in autogamous or partially autogamous species such as barley, wheat, or rapeseed. For these species, the DRR crossing scheme might be a promising alternative, because it requires the creation of only a relatively low number of crosses, while almost the same 1 − β* estimates were observed as for the DIA and FCT designs. Finally, my results clearly indicated that it is advantageous to create NAM populations from a large number of parental inbreds.
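The difference in crossing effort between the designs can be made concrete with a small counting sketch. It assumes that REF, DIA, FCT, and DRR denote the reference, diallel, factorial, and double round-robin designs in their usual sense (cf. Verhoeven et al. 2006) and that reciprocal crosses are not counted separately. The split of the parents into two equally sized groups for the factorial design and the parent labels are hypothetical.

```python
from itertools import combinations


def reference_crosses(parents, reference):
    """Reference design: one common parent is crossed with every other parent."""
    return [(reference, p) for p in parents if p != reference]


def diallel_crosses(parents):
    """Half diallel: every pair of parents is crossed once."""
    return list(combinations(parents, 2))


def factorial_crosses(group_a, group_b):
    """Factorial design: every parent of one group is crossed with every parent of the other."""
    return [(a, b) for a in group_a for b in group_b]


def round_robin_crosses(parents, steps=(1, 2)):
    """Round-robin chain: parent i is crossed with parents i + s for each step s.

    steps=(1,) gives a single and steps=(1, 2) a double round-robin design.
    Pairs are stored as frozensets so that each cross is counted only once.
    """
    n = len(parents)
    crosses = set()
    for i in range(n):
        for s in steps:
            partner = parents[(i + s) % n]
            if partner != parents[i]:
                crosses.add(frozenset((parents[i], partner)))
    return crosses


if __name__ == "__main__":
    parents = [f"P{i:02d}" for i in range(1, 27)]  # 26 placeholder parent labels
    print("REF:", len(reference_crosses(parents, parents[0])))
    print("DIA:", len(diallel_crosses(parents)))
    print("FCT:", len(factorial_crosses(parents[:13], parents[13:])))
    print("DRR:", len(round_robin_crosses(parents)))
```

With 26 parents and these definitions, the diallel design requires n(n − 1)/2 crosses, whereas the double round-robin design requires only 2n, which illustrates why the latter may be easier to realize in species in which crossing is laborious.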

Acknowledgments

I thank Edward S. Buckler and Detlef Weigel for providing the genotypic information for the maize and A. thaliana inbreds, respectively. Furthermore, I thank Maarten Koornneef for critical reading of the manuscript. I thank the Plant Computational Biology group of the Max Planck Institute for Plant Breeding Research for use of their computer cluster as well as the associate editor and two anonymous reviewers for their valuable suggestions. Funding for this work was provided by the Max Planck Society.

Notes

Supporting information is available online at http://www.genetics.org/cgi/content/full/genetics.109.108449/DC1.

References

  • Alonso, J. M., A. N. Stepanova, T. J. Leisse, C. J. Kim, H. Chen et al., 2003. Genome-wide insertional mutagenesis of Arabidopsis thaliana. Science 301 653–657.
  • Bajaj, Y. P. S., 1977. In vitro induction of haploids in wheat (Triticum aestivum L.). Crop Improv. 4 54–64.
  • Beavis, W. D., 1994. The power and deceit of QTL experiments: lessons from comparative QTL studies, pp. 250–266 in 49th Annual Corn and Sorghum Industry Research Conference. American Seed Trade Association, Washington, DC.
  • Bernardo, R., 2009. Should maize doubled haploids be induced among F1 or F2 plants? Theor. Appl. Genet. 119 255–262.
  • Bordes, J., R. D. de Vaulx, A. Lapierre and M. Pollacsek, 1997. Haplodiploidization of maize (Zea mays L.) through induced gynogenesis assisted by glossy markers and its use in breeding. Agronomie 17 291–297.
  • Buckler, E. S., J. B. Holland, P. J. Bradbury, C. B. Acharya, P. J. Brown et al., 2009. The genetic architecture of maize flowering time. Science 325 714–718.
  • Burr, B., F. A. Burr, K. H. Thompson, M. C. Albertsen and C. W. Stuber, 1988. Gene mapping with recombinant inbreds in maize. Genetics 118 519–526.
  • Clark, R. M., G. Schweikert, C. Toomajian, S. Ossowski, G. Zeller et al., 2007. Common sequence polymorphisms shaping genetic diversity in Arabidopsis thaliana. Science 317 338–342.
  • Comstock, R. E., and H. F. Robinson, 1948. The components of genetic variance in populations of biparental progenies and their use in estimating the average degree of dominance. Biometrics 4 254–266.
  • Falconer, D. S., and T. F. C. Mackay, 1996. Introduction to Quantitative Genetics, Ed. 4. Longman Group, London.
  • Flint-Garcia, S. A., A. Thuillet, J. Yu, G. Pressoir, S. M. Romero et al., 2005. Maize association population: a high resolution platform for QTL dissection. Plant J. 44 1054–1064.
  • Fridman, E., T. Pleban and D. Zamir, 2000. A recombination hotspot delimits a wild-species quantitative trait locus for tomato sugar content to 484 bp within an invertase gene. Proc. Natl. Acad. Sci. USA 97 4718–4723.
  • Griffing, B., 1956. Concept of general and specific combining ability in relation to diallel crossing systems. Aust. J. Biol. Sci. 9 463–493.
  • Holland, J. B., 2007. Genetic architecture of complex traits in plants. Curr. Opin. Plant Biol. 10 156–161.
  • Jensen, C. J., 1975. Barley monoploids and double monoploids: techniques and experience, pp. 316–345 in Barley Genetics IV, edited by H. Gaul. Thiemig, München, Germany.
  • Kim, S., V. Plagnol, T. T. Hu, C. Toomajian, R. M. Clark et al., 2007. Recombination and linkage disequilibrium in Arabidopsis thaliana. Nat. Genet. 39 1151–1155.
  • Kraakman, A. T. W., R. E. Niks, P. M. M. M. Van den Berg, P. Stam and F. A. Eeuwijk, 2004. Linkage disequilibrium mapping of yield and yield stability in modern spring barley cultivar. Genetics 168 435–446.
  • Lande, R., and R. Thompson, 1990. Efficiency of marker-assisted selection in the improvement of quantitative traits. Genetics 124 743–756.
  • Lander, E. S., and D. Botstein, 1989. Mapping Mendelian factors underlying quantitative traits using RFLP linkage maps. Genetics 121 185–199.
  • Liu, K., M. Goodman, S. Muse, J. S. Smith, E. Buckler et al., 2003. Genetic structure and diversity among maize inbred lines as inferred from DNA microsatellites. Genetics 165 2117–2128.
  • Maurer, H. P., A. E. Melchinger and M. Frisch, 2008. Population genetical simulation and data analysis with Plabsoft. Euphytica 161 133–139.
  • McMullen, M. D., S. Kresovich, H. Sanchez Villeda, P. Bradbury, H. Li et al., 2009. Genetic properties of the maize nested association mapping population. Science 325 737–740.
  • Meuwissen, T. H., A. Karlsen, S. Lien, I. Olsaker and M. E. Goddard, 2002. Fine mapping of a quantitative trait locus for twinning rate using combined linkage and linkage disequilibrium mapping. Genetics 161 373–379.
  • Nei, M., and W. H. Li, 1979. Mathematical model for studying genetic variation in terms of restriction endonucleases. Proc. Natl. Acad. Sci. USA 76 5269–5273.
  • Nordborg, M., T. T. Hu, Y. Ishino, J. Jhaveri, C. Toomajian et al., 2005. The pattern of polymorphism in Arabidopsis thaliana. PLoS Biol. 3 e196.
  • R Development Core Team, 2004. R: A Language and Environment for Statistical Computing. Vienna.
  • Schön, C. C., H. F. Utz, S. Groh, B. Truberg, S. Openshaw et al., 2004. Quantitative trait locus mapping based on resampling in a vast maize testcross experiment and its relevance to quantitative genetics for complex traits. Genetics 167 485–498.
  • Schwarz, G., 1978. Estimating the dimension of a model. Ann. Stat. 6 461–464.
  • Shendure, J., R. D. Mitra, C. Varma and G. M. Church, 2004. Advanced sequencing technologies: methods and goals. Nat. Rev. Genet. 5 335–344.
  • Singer, T., Y. Fan, H. S. Chang, T. Zhu, S. P. Hazen et al., 2006. A high-resolution map of Arabidopsis recombinant inbred lines by whole-genome exon array hybridization. PLoS Genet. 2 e144.
  • Stich, B., J. Yu, A. E. Melchinger, H.-P Piepho, H. F. Utz et al., 2007. Power to detect higher-order epistatic interactions in a metabolic pathway using a new mapping strategy. Genetics 176 563–570.
  • Stich, B., H. F. Utz, H.-P. Piepho, H. P. Maurer and A. E. Melchinger, 2009. Optimum allocation of resources for QTL detection using a nested association mapping strategy in maize. Theor. Appl. Genet. (in press).
  • Tuinstra, M. R., G. Ejeta and P. B. Goldsbrough, 1997. Heterogeneous inbred family (HIF) analysis: a method for developing near-isogenic lines that differ at quantitative trait loci. Theor. Appl. Genet. 95 1005–1011.
  • Utz, H. F., and A. E. Melchinger, 1996. PLABQTL: a program for composite interval mapping of QTL. J. Quant. Trait Loci 2 1–5.
  • van Ooijen, J. W., 1992. Accuracy of mapping quantitative trait loci in autogamous species. Theor. Appl. Genet. 84 803–811.
  • Verhoeven, K. J. F., J.-L. Jannink and L. M. McIntyre, 2006. Using mating designs to uncover QTL and the genetic architecture of complex traits. Heredity 96 139–149.
  • Wenzel, G., F. Hoffman and E. Thomas, 1977. Anther culture as a breeding tool in rape. I. Ploidy level and phenotype of androgenetic plants. Z. Pflanzenzücht. 78 149–155.
  • Willer, C. J., S. Sanna, A. U. Jackson, A. Scuteri, L. L. Bonnycastle et al., 2008. Newly identified loci that influence lipid concentrations and risk of coronary artery disease. Nat. Genet. 40 161–169.
  • Wilson, L. M., S. R. Whitt, A. M Ibáñez, T. R. Rocheford, M. M. Goodman et al., 2004. Dissection of maize kernel composition and starch production by candidate gene association. Plant Cell 16 2719–2733.
  • Xu, S., 1996. Mapping quantitative trait loci using four-way crosses. Genet. Res. 68 175–181.
  • Yano, M., 2001. Genetic and molecular dissection of naturally occurring variation. Curr. Opin. Plant Biol. 4 130–135.
  • Yu, J., and E. Buckler, 2006. Genetic association mapping and genome organization of maize. Curr. Opin. Biotech. 17 155–160.
  • Yu, J., J. B. Holland, M. D. McMullen and E. S. Buckler, 2008. Power analysis of an integrated mapping strategy: nested association mapping. Genetics 178 539–551.

Articles from Genetics are provided here courtesy of Genetics Society of America