Heredity. Jan 2011; 106(1): 124–133.
Published online Mar 24, 2010. doi:  10.1038/hdy.2010.20
PMCID: PMC3183861

Bayesian analysis for genetic architecture of dynamic traits

L Min,1,2 R Yang,1,* X Wang,1 and B Wang1

Abstract

The dissection of the genetic architecture of quantitative traits, including the number and locations of quantitative trait loci (QTL) and their main and epistatic effects, has become an important topic in QTL mapping. We extend the Bayesian model selection framework for mapping multiple epistatic QTL affecting continuous traits to dynamic traits in experimental crosses. The extension inherits the efficiency of Bayesian model selection and the flexibility of the Legendre polynomial model in fitting the change in genetic and environmental effects with time. We illustrate the proposed method by simultaneously detecting the main and epistatic QTLs for the growth of leaf age in a doubled-haploid population of rice. The behavior and performance of the method are also shown by computer simulation experiments. The results show that our method can more quickly identify interacting QTLs for dynamic traits in models with large numbers of genetic effects, enhancing our understanding of the genetic architecture of dynamic traits. Our proposed method can be treated as a general form of mapping QTL for continuous quantitative traits, and is easily extended to multiple traits and to a single trait with repeat records.

Keywords: Bayesian model selection, dynamic trait, QTL, epistatic, Legendre polynomial

Introduction

The process of formation and development of a biological trait may have different temporal and spatial properties. Such a trait whose phenotype changes with time or quantitative factor is known as a dynamic trait. Biologically speaking, the change in phenotype of the trait can be due to different genes that turn on or off at various times. In other words, the dynamic trait is governed by some genes whose genetic effects change with time. Studying the changing laws of these gene effects and their mutual relationships can enhance our understanding of the genetic architecture of dynamic traits.

The genetic mechanism of dynamic traits has been studied in practice by mapping quantitative trait loci (QTL) at fixed time points, with separate analysis (Cheverud et al., 1983; Nuzhdin et al., 1997; Verhaegen et al., 1997; Emebiri et al., 1998; Wu et al., 1999), joint analysis (Jiang and Zeng, 1995; Korol et al., 1995; Ronin et al., 1995; Eaves et al., 1996; Knott and Haley, 2000) or conditional analysis (Yan et al., 1998a, 1998b; Wu et al., 2002). Subsequently, the research focus has gradually shifted toward better fitting the changing laws in the genetic effects of QTL genotypes and genes. Wu and his colleagues (Ma et al., 2002; Wu et al., 2004a, 2004b, 2004c; Wu and Lin, 2006) proposed a functional mapping strategy constructed within the context of interval mapping, where the mean vectors of QTL genotypes within a time interval are modeled by a biologically meaningful mathematical equation, and the covariance matrix is modeled in terms of its time series autocorrelation structure (Ma et al., 2002). Fitting the Legendre polynomials to the time-dependent genetic effects of markers outside the test interval, Yang et al. (2007) presented a flexible nonparametric approach for composite functional mapping of dynamic traits. Although these functional mapping strategies have emerged as a powerful tool for mapping dynamic trait loci, using nonlinear, biologically meaningful mathematical models to describe changes in QTL genotype effects may limit their extension to a multiple QTL model. Moreover, there is still a lack of biologically meaningful mathematical models for most dynamic traits.

The Legendre polynomial has been extensively used by animal geneticists and breeders to fit changes in breeding values for milk production and other dynamic traits (Kirkpatrick and Heckman, 1989; Kirkpatrick et al., 1990; Schaeffer, 2004). This has stimulated several usages of the Legendre polynomial in QTL mapping for dynamic traits. For example, Yang et al. (2004) and Huang et al. (2005) replaced the Logistic curve with the Legendre polynomial and made functional mapping suitable for dynamic traits with an arbitrary shape. In Macgregor et al. (2005), the Legendre polynomial was applied to QTL mapping for longitudinal traits in pedigrees. They adopted the traditional random regression model in which the vector of polynomial regression coefficients (genetic effects) for each animal is treated as a random vector sampled from a multivariate normal distribution. For line crosses, Yang et al. (2006) proposed an interval mapping method for dynamic traits by using the Legendre polynomial to model the population mean, QTL effects and time-dependent environmental effects. On the basis of this interval mapping method, Yang and Xu (2007) subsequently developed a Bayesian shrinkage analysis framework to simultaneously map genome-wide QTLs with multiple main effects for dynamic traits.

The dissection of the genetic architecture of quantitative traits, including the number and locations of QTLs and their main and epistatic effects, has become an important topic in current QTL mapping. In fact, the unknown number of QTLs and the possibly huge number of epistatic effects make the issue extremely complex. A promising approach for solving the issue is the Bayesian model selection framework, which has been developed to identify epistatic QTL for regular quantitative traits (Yi et al., 2005, 2007b) and for ordinal traits (Yi et al., 2007a), but not for dynamic traits.

In this study, we will extend the Bayesian model selection for mapping interacting QTLs developed by Yi et al. (2005, 2007b) to dynamic traits in experimental crosses. The extension is realized by embedding the Legendre polynomials in QTL effects and taking into account the individual-specific time-dependent random environmental effects in the genetic model. Our extension inherits the efficiency of the Bayesian model selection and the flexibility of the Legendre polynomial model fitting to change in genetic and environmental effects with time, and can fairly quickly identify interacting QTLs for dynamic traits in models with large numbers of genetic effects, which are demonstrated by analyzing the simulated data and real data on leaf age growth in rice.

Methods

Genetic model

We start with a simple population including only two segregating genotypes at each locus, such as a backcross (BC), doubled-haploid (DH) lines or recombinant inbred lines. For mapping the QTL of dynamic traits, phenotypes of repeated measurements in the time interval [t0, tm] and molecular marker data need to be collected on n individuals. Assume that there are q QTL responsible for changing the trajectory of dynamic traits. The phenotypic value yi(t) of individual i measured at time t can then be described by the following multiple interacting QTL model (Kao and Zeng, 2002):

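$$y_i(t) = \mu(t) + \sum_{j=1}^{q} \gamma_j x_{ij}\,\beta_j(t) + \sum_{j=1}^{q-1}\sum_{k=j+1}^{q} \gamma_{jk} z_{ijk}\,\delta_{jk}(t) + \xi_i(t) + \varepsilon_i(t) \qquad (1)$$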

where μ(t) is the population mean at time t; βj(t) for j=1, 2,…,q is the additive effect of the jth QTL at time point t; δjk(t) is the epistatic effect between the jth QTL and the kth QTL for j=1, 2,…,q−1; k=j+1, j+2,…,q; xij is a genotype indicator variable for individual i at locus j and is defined as 1 for one genotype and −1 for the other genotype; zijk is the dummy variable for the epistatic effect between the jth QTL and the kth QTL on the ith individual, zijk=xijxik; γ is a binary variable for each genetic effect, indicating whether the corresponding effect is included (γ=1) or excluded (γ=0) from model (1); ξi(t) is an individual-specific time-dependent random environmental effect, distributed as N(0, σξ2(t)); and εi(t) is a time-independent random residual error, following the normal distribution with mean 0 and variance σ2. Notice that by inferring γ, the Bayesian model selection enables the Markov Chain Monte Carlo (MCMC) sampling for QTL parameters to be conducted in a reduced model space (Carlin and Chib, 1995; Yi, 2004).

The Legendre polynomial of order p is chosen to fit the changing trajectories of the population mean, QTL effects and individual-specific environmental effects. Let ψ(t) be the basis of the Legendre polynomial (see Yang et al., 2006) and stipulate that μ(t)=ψ(t)μ, βj(t)=ψ(t)βj, δjk(t)=ψ(t)δjk and ξi(t)=ψ(t)ξi, where μ, βj, δjk and ξi are (p+1) × 1 vectors of regression coefficients. Model (1) can then be rewritten as

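$$y_i(t) = \psi(t)\mu + \sum_{j=1}^{q} \gamma_j x_{ij}\,\psi(t)\beta_j + \sum_{j=1}^{q-1}\sum_{k=j+1}^{q} \gamma_{jk} z_{ijk}\,\psi(t)\delta_{jk} + \psi(t)\xi_i + \varepsilon_i(t) \qquad (2)$$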

Assume that ξi is i.i.d. N(0, Σ), where Σ is a (p+1) × (p+1) positive definite covariance matrix.
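To make the role of ψ(t) concrete, the following minimal sketch evaluates a Legendre basis at a set of measurement times. It assumes that times are rescaled linearly onto [−1, 1] and that each polynomial is scaled by sqrt((2k+1)/2), a normalization common in random regression models; the exact basis used by Yang et al. (2006) may differ. The function name `legendre_basis` is hypothetical, and the time points are those reported for the rice data.

```python
import numpy as np
from numpy.polynomial import legendre


def legendre_basis(times, order, normalized=True):
    """Return an (m+1) x (p+1) matrix whose rows are psi(t) for each time point."""
    t = np.asarray(times, dtype=float)
    # Rescale measurement times onto [-1, 1], where Legendre polynomials are orthogonal.
    u = -1.0 + 2.0 * (t - t.min()) / (t.max() - t.min())
    basis = legendre.legvander(u, order)  # columns are P_0(u), ..., P_p(u)
    if normalized:
        # Assumed normalization (not confirmed by the source): sqrt((2k + 1) / 2).
        k = np.arange(order + 1)
        basis = basis * np.sqrt((2 * k + 1) / 2.0)
    return basis


times = [5, 8, 13, 18, 21, 26, 32, 39]
psi_rows = legendre_basis(times, order=2)
psi = psi_rows.T  # (p+1) x (m+1), the layout used for psi in matrix notation
print(psi.shape)
```

Each column of `psi` then plays the part of ψT(t) for one time point, so a trajectory such as μ(t) is obtained as `psi.T @ mu` for a coefficient vector `mu`.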

For simplicity of description, we assume that each individual has m+1 measurements at m+1 different time points t0, t1,…, tm and that the time points are common for all individuals. However, our method can accommodate data from arbitrary time points. Let yi=[yi(t0) yi(t1) … yi(tm)]T be a (m+1) × 1 column vector for the repeated measurements of the dynamic trait, and define ψ=[ψT(t0) ψT(t1) … ψT(tm)] as a (p+1) × (m+1) matrix. In matrix notation, model (2) becomes

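$$y_i = \psi^{T}\mu + \sum_{j=1}^{q} \gamma_j x_{ij}\,\psi^{T}\beta_j + \sum_{j=1}^{q-1}\sum_{k=j+1}^{q} \gamma_{jk} z_{ijk}\,\psi^{T}\delta_{jk} + \psi^{T}\xi_i + \varepsilon_i \qquad (3)$$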

where εi=[εi(t0) εi(t1) … εi(tm)]T is a (m+1) × 1 vector for the environmental errors with εi~N(0, Iσ2), where I is an (m+1) × (m+1) identity matrix. The conditional expectation of model (3) given the fixed effects, such as population mean and genetic effects, is

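$$E(y_i) = M_i = \psi^{T}\mu + \sum_{j=1}^{q} \gamma_j x_{ij}\,\psi^{T}\beta_j + \sum_{j=1}^{q-1}\sum_{k=j+1}^{q} \gamma_{jk} z_{ijk}\,\psi^{T}\delta_{jk}$$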

and the variance–covariance matrix is

$$\mathrm{Var}(y_i) = V = \psi^{T}\Sigma\,\psi + I\sigma^{2}$$

for all i=1, 2,…, n.

Bayesian mapping

Similar to the cases for regular quantitative traits, the Bayesian mapping framework implemented in MCMC algorithms for dynamic traits mainly consists of six consecutive parts: (1) to establish the likelihood function for phenotypes according to the given genetic model reflecting the relationship between phenotypes and unknown parameters; (2) to specify the prior distribution for each unknown parameter; (3) to form the joint posterior distribution by multiplying the likelihood function from step 1 by all prior distributions from step 2; (4) to obtain the conditional posterior distribution for each unknown parameter by fixing other parameters in joint posterior distribution; (5) to draw MCMC samples for each unknown parameter from the corresponding conditional posterior distributions and (6) to analyze the posterior samples for each parameter and statistically characterize them. In contrast to regular quantitative traits, however, Bayesian mapping for dynamic traits is more complex because of the consideration of time dependence of QTL effects and random environmental effects on traits of interest.

Likelihood function

Denote the phenotypic observations y={yi} for i=1, 2,…, n, the unknown parameters γ={γj γjk}, X={xij zijk}, λ={λj} with λj being the position of the jth QTL and θ={μ βj δjk ξi Σ σ2} for j=1, 2,…, q; k=j+1, j+2,…, q. The likelihood function is the conditional distribution of y given γ, X and θ, which is denoted by:

equation image

Prior distribution

Notice that the genetic effects of QTLs on dynamic traits in models (1)–(3) are equivalent to nesting the Legendre polynomial within the genetic effects of QTL on regular quantitative traits. Therefore, in Bayesian mapping for dynamic traits, choices of the upper bound L and specification of the prior on γ and λ should be the same as those for regular quantitative traits. As described by Yi et al. (2005), we take L as l0+3√l0, where l0 is the prior expected number of QTLs and is determined according to initial investigations with traditional methods. The binary indicator γ is assumed to have an independent prior p(γ)=w^γ(1−w)^(1−γ), where w is the prior inclusion probability for a certain QTL effect and equals the predetermined hyperparameter wm for main effects or we for epistatic effects, respectively. Priors on λ are assumed to be independent and uniformly distributed over the entire genome, that is, QTL positions have a uniform prior.
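As a small worked check of the upper bound L=l0+3√l0, the sketch below evaluates it for the prior expected numbers of QTLs used later in the paper (5, 8 and 6). Rounding to the nearest integer is an assumption; it is chosen here because it reproduces the values of L reported in the analyses. The function name `qtl_upper_bound` is hypothetical.

```python
import math


def qtl_upper_bound(prior_expected_qtl):
    """Upper bound L = l0 + 3*sqrt(l0), rounded to the nearest integer (assumed rounding)."""
    return round(prior_expected_qtl + 3.0 * math.sqrt(prior_expected_qtl))


for l0 in (5, 8, 6):
    print(l0, qtl_upper_bound(l0))
```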

The prior for the population mean μ is N (μ0, Σ0). We can empirically set

equation image

where bi=(ψTψ)−1ψTyi and is a vector of regression coefficients obtained by fitting the individual dynamic trajectory.
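The per-individual regression step can be sketched as an ordinary least-squares fit of each trajectory onto the Legendre basis. The code below is illustrative only: it uses an unnormalized basis and a design matrix with one row per time point (that is, ψT in the notation of the Methods, so the normal equations involve ψψT), and it takes the sample mean and sample covariance of the fitted coefficients as one plausible empirical choice for μ0 and Σ0. That choice is an assumption, since the defining equation is not reproduced here. All names and the toy data are hypothetical.

```python
import numpy as np
from numpy.polynomial import legendre


def fit_individual_coefficients(Y, times, order):
    """Least-squares Legendre coefficients b_i for each row of Y (n x (m+1))."""
    t = np.asarray(times, dtype=float)
    u = -1.0 + 2.0 * (t - t.min()) / (t.max() - t.min())
    design = legendre.legvander(u, order)  # (m+1) x (p+1), one row per time point
    coef, *_ = np.linalg.lstsq(design, np.asarray(Y, dtype=float).T, rcond=None)
    return coef.T  # n x (p+1)


rng = np.random.default_rng(1)
times = [5, 8, 13, 18, 21, 26, 32, 39]
true_b = rng.normal(size=(4, 3))  # hypothetical coefficients for 4 individuals
u = -1.0 + 2.0 * (np.array(times) - 5.0) / (39.0 - 5.0)
Y = true_b @ legendre.legvander(u, 2).T  # noise-free toy trajectories

B = fit_individual_coefficients(Y, times, order=2)
mu0 = B.mean(axis=0)              # assumed empirical prior mean
Sigma0 = np.cov(B, rowvar=False)  # assumed empirical prior covariance
```

Because the toy trajectories are noise-free, the fitted rows of `B` recover `true_b`; with real data they would be noisy estimates.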

We propose the following hierarchical mixture prior for each additive genetic effect,

equation image

with

equation image

and c being set to n such that the prior variance of each fixed effect stays approximately the same as n increases. Similarly, we take the prior distribution for epistatic effect as

equation image

with

equation image

The random effects ξi are assumed to have an independent multivariate normal distribution, that is, ξi~Np+1(0, Sa) with the hyperparameter Sa being a (p+1) × (p+1) matrix.

An inverse Wishart prior is chosen for the covariance matrix of regression coefficients for random environmental effect, denoted by Σ~IW (νa, νaSa) with νa being a hyperparameter.

The residual variance is assigned a scaled inverse χ2 prior distribution, with νe and se being hyperparameters.

Genotypes of missing markers were generated randomly in each iteration on the basis of the probability inferred jointly from the nearest nonmissing flanking markers and the phenotype. The probability from the missing marker locus is treated as the prior probability. After incorporation of the marker (Locus) effects through the phenotype, the probability becomes the posterior probability, which is used to generate the missing marker genotype from multinomial distribution. The detailed calculation of posterior probabilities for missing marker genotypes can be found in Wang et al. (2005).

The joint prior of all parameters takes the product of the priors of individual parameters.

MCMC algorithm

In general, the joint posterior density derived from the likelihood function and the joint prior of all parameters is analytically intractable. However, MCMC methods such as the Gibbs sampler (Gelman et al., 1995) and the Metropolis–Hastings algorithm (Metropolis et al., 1953; Hastings, 1970) can be used to draw samples, from which features of marginal distributions of interest can be inferred.

Within the framework of Bayesian model selection, not only is the upper bound L on the number of QTLs given, but the value of γ sampled at the current iteration also determines which genetic effects and QTL positions will be drawn or estimated at the next iteration. This allows us to conduct Bayesian sampling for QTL parameters in a reasonably reduced model space, thus greatly decreasing the computational demand.

On the basis of marginal posterior distribution for each parameter (shown in Appendix A), we implement MCMC sampling by the following computationally efficient process:

  1. Evenly partition the entire genome into small intervals (1 or 2 cM long) by a number of points and restrict putative QTLs to these fixed points. Estimate all expected values of indicator variables X for putative QTL by using conditional probabilities of their genotypes on two flanking markers.
  2. Divide the entire genome into L equal intervals and put one QTL in the middle of each interval.
  3. Initialize all variables with some legal values or values sampled from their prior distributions;
  4. Update the population mean μ;
  5. Update the binary indicators γ with an efficient Metropolis–Hastings algorithm (Kohn et al., 2001; Yi et al., 2007a);
  6. Update the additive QTL effects βj corresponding to γj=1;
  7. Update the epistatic QTL effects δjk corresponding to γjk=1;
  8. Update the residual variance σ2;
  9. Update the QTL position λj on those fixed points, corresponding to γ=1;
  10. Repeat steps (4)–(9) until the Markov chain reaches a desirable length.
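Steps 1 and 2 above can be sketched as follows. This is a simplified illustration that treats the genome as a single linear segment (as in the simulation study) and snaps each initial QTL to the nearest grid point; how chromosome boundaries are handled in the authors' Matlab program is not stated. The function name and arguments are hypothetical.

```python
import numpy as np


def initial_qtl_positions(genome_length_cm, n_qtl, grid_step_cm=1.0):
    """Return the grid of putative positions and one starting position per QTL."""
    n_points = int(round(genome_length_cm / grid_step_cm)) + 1
    grid = np.linspace(0.0, genome_length_cm, n_points)  # step 1: fixed grid
    # Step 2: split the genome into n_qtl equal intervals and take their midpoints.
    midpoints = (np.arange(n_qtl) + 0.5) * genome_length_cm / n_qtl
    # Restrict each starting position to the nearest grid point.
    nearest = np.abs(grid[:, None] - midpoints[None, :]).argmin(axis=0)
    return grid, grid[nearest]


grid, start = initial_qtl_positions(genome_length_cm=600.0, n_qtl=13)
print(start)
```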

As the order of V equals the number of repeat measurements for dynamic traits, it is hard to calculate the inverse and determinant of V when there are a large number of repeat measurements. In practice, therefore, the inverse and determinant of V need to be computed in a reduced-dimension form during MCMC sampling. The detailed derivation of the simplified formula is given in Appendix B.

To analyze models with multiple interacting QTLs and models with only multiple main-effect QTLs by using Bayesian model selection, we wrote a program in Matlab to implement the MCMC sampling; it is available from the authors on request.
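Appendix B is not reproduced here, so the following is only a sketch of one standard way to obtain such a reduction, assuming V=ψTΣψ+Iσ2 with ψ of dimension (p+1) × (m+1). It uses the Woodbury identity for the inverse and the matrix determinant lemma for the determinant, so that only (p+1) × (p+1) matrices are inverted; the authors' derivation may differ. The function name and the toy inputs are hypothetical.

```python
import numpy as np


def reduced_inverse_and_logdet(psi, Sigma, sigma2):
    """Inverse and log-determinant of V = psi^T Sigma psi + sigma2 * I.

    Only (p+1) x (p+1) systems are solved, where psi is (p+1) x (m+1).
    """
    n_times = psi.shape[1]
    # Small (p+1) x (p+1) matrix: Sigma^{-1} + psi psi^T / sigma2.
    cap = np.linalg.inv(Sigma) + psi @ psi.T / sigma2
    # Woodbury identity.
    V_inv = np.eye(n_times) / sigma2 - psi.T @ np.linalg.solve(cap, psi) / sigma2**2
    # Matrix determinant lemma: |V| = sigma2^(m+1) * |Sigma| * |cap|.
    _, logdet_cap = np.linalg.slogdet(cap)
    _, logdet_Sigma = np.linalg.slogdet(Sigma)
    logdet_V = n_times * np.log(sigma2) + logdet_Sigma + logdet_cap
    return V_inv, logdet_V


rng = np.random.default_rng(0)
psi = rng.normal(size=(3, 8))        # toy basis matrix, p + 1 = 3, m + 1 = 8
A = rng.normal(size=(3, 3))
Sigma = A @ A.T + np.eye(3)          # toy positive definite covariance
sigma2 = 0.5
V_inv, logdet_V = reduced_inverse_and_logdet(psi, Sigma, sigma2)
V = psi.T @ Sigma @ psi + sigma2 * np.eye(8)  # direct form, for comparison only
```

The saving matters when the number of time points is much larger than p+1, because the cost of the reduced form grows with the polynomial order rather than with the number of measurements.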

Post-MCMC analysis

The posterior sample can be used to infer the genetic architecture of quantitative traits, including the number and locations of QTL and their main and epistatic effects. Before doing so, we need to monitor the mixing behavior and convergence rates of the MCMC algorithms by visually inspecting trace plots of the sample values of scalar quantities of interest or by using formal diagnostic methods provided in the package R/coda (Plummer et al., 2006). Model averaging, which accounts for model uncertainty, provides more robust inference than a single optimal model approach (Raftery et al., 1997; Ball, 2001; Sillanpää and Corander, 2002) and is therefore used to assess the characteristics of genetic architecture by averaging over possible models weighted by their posterior probabilities. We can use various methods to graphically and numerically summarize and interpret the posterior samples. The posterior inclusion probability for each locus is estimated as its frequency in the posterior samples; taking the prior probability into consideration, we use Bayes factors (BFs) to show evidence for inclusion against exclusion of each QTL effect. The BF for a locus or QTL effect is defined as the ratio of the posterior odds to the prior odds for inclusion against exclusion of the QTL locus or effect (Kass and Raftery, 1995). Generally, a threshold of BF=3, or equivalently 2 ln BF≈2.2, is used for declaring statistical significance for each QTL effect (Kass and Raftery, 1995).
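The Bayes factor defined above can be computed directly from the sampled indicators. The sketch below is illustrative: the indicator samples and the prior inclusion probability are made-up values, and the function name is hypothetical.

```python
import math


def bayes_factor(posterior_inclusion, prior_inclusion):
    """Ratio of posterior odds to prior odds for inclusion against exclusion."""
    if posterior_inclusion >= 1.0:
        return math.inf  # effect included in every sample
    posterior_odds = posterior_inclusion / (1.0 - posterior_inclusion)
    prior_odds = prior_inclusion / (1.0 - prior_inclusion)
    return posterior_odds / prior_odds


# Hypothetical thinned samples of one indicator gamma: included in 600 of 5000 draws.
gamma_samples = [1] * 600 + [0] * 4400
posterior_inclusion = sum(gamma_samples) / len(gamma_samples)
bf = bayes_factor(posterior_inclusion, prior_inclusion=0.01)

print(bf > 3.0)           # compare with the BF threshold of 3
print(2.0 * math.log(3))  # the same threshold on the 2 ln BF scale
```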

Real data analysis

A doubled-haploid (DH) population with 111 lines was generated by crossing an indica rice variety Gui-630 and a japonica rice variety Taiwanjing. A linkage map composed of 175 RFLP markers was constructed using the DH population, covering a total length of 1225 cM with average spacing of 7 cM (Weng et al., 2000). This DH population was grown with replicates in a field trial (Zhou et al., 2001). For each plant, the number of developed leaves on the main stem was counted, and the length of the developing leaf was measured every 3–7 days from day 30 after sowing until the full development of the leaf. These measured data were used to estimate the leaf age of a plant (y) using

equation image

The measurement time points, counted in days after seeding, were t=(5 8 13 18 21 26 32 39).

On the basis of the observed pattern of phenotypic change in the trait, we select the Legendre polynomial of order 2 to model the changes in the population mean and genetic effects with growth time. The data are analyzed with both the maximum likelihood method (Yang et al., 2006) and the Bayesian method.

Before Bayesian sampling, we partitioned each chromosome with a 1-cM grid, which resulted in 1214 possible loci across the genome. The actual values for the hyperparameters are Sa=Se=0.5I, νa=p+1 and νe=0. The initial values of all variables were sampled from their prior distributions. For all Bayesian analyses, the MCMC sampling ran for 200 000 cycles after discarding the first 2000 cycles as burn-in. The chain was thinned by recording one sample in every 40 cycles, yielding 5000 samples for posterior Bayesian analysis.
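The burn-in and thinning arithmetic can be checked with a few lines of code. This sketch assumes cycles are numbered from 1 and that the last cycle of each thinning window is the one saved; the function name is hypothetical. The second call uses the settings reported for the simulation study (10 000 burn-in cycles, 150 000 further cycles, one sample saved in every 50).

```python
def thinned_cycles(burn_in, n_cycles, thin):
    """Cycle numbers (1-based) that are saved after burn-in, keeping one per `thin` cycles."""
    total = burn_in + n_cycles
    return list(range(burn_in + thin, total + 1, thin))


real_data = thinned_cycles(burn_in=2000, n_cycles=200_000, thin=40)
simulation = thinned_cycles(burn_in=10_000, n_cycles=150_000, thin=50)
print(len(real_data), len(simulation))
```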

With interval mapping based on maximum likelihood (Yang et al., 2006), five significant QTLs were detected, on chromosomes 1, 5, 9, 10 and 12. Under the nonepistatic analysis, the number of significant QTLs detected in interval mapping was taken as the prior number of main-effect QTLs, and the upper bound of the number of QTLs was then calculated as L=5+3√5=12. The graph of the BFs is displayed as the bottom plot in Figure 1. It can be seen that besides the five QTLs identified by interval mapping, four more clear peaks arise on chromosomes 2, 3, 4 and 7. Moreover, the BFs of all nine peaks are greater than the significance threshold of 3.

Figure 1
Profiles of Bayes factor with Bayesian nonepistatic analysis. Chromosomes are separated by the vertical dotted lines and marker positions are indicated by the ticks on the horizontal axis.

The epistatic analysis also took the expected number of main-effect QTLs to be 5, as the nonepistatic analysis did, and the expected number of all QTLs was chosen as 8. The maximum number of QTLs was then L=8+3√8=16.

The estimated population mean and covariance matrix for random regression coefficients for the individual-specific environmental effects are

equation image

and

equation image

respectively. The estimated residual variance is σ̂2=0.0083.

The profiles of the BF for each locus across the genome are depicted in the top plot in Figure 2. Compared with the relative profiles in Figure 1, 12 peaks can be found, including the 9 loci detected by nonepistatic analysis. Except for the peak on chromosome 11, others show strong evidence for the presence of QTLs.

Figure 2
Profiles of Bayes factor with Bayesian epistatic analysis. Chromosomes are separated by the vertical dotted lines and marker positions are indicated by the ticks on the horizontal axis.

As shown in Figure 3, Bayesian epistatic analysis found that four pairs of QTLs on chromosomes 1, 2, 3 and 4 perform strong interactions, and that the QTL pair on chromosomes 3 and 10 and the one on chromosomes 4 and 8 have relatively high BF values, but the interactions are nonsignificant. Note that the fourth QTL on chromosome 3 and the eighth QTL on chromosome 8 are not found in nonepistatic analysis. Hence, we infer that the fourth and eighth QTLs are detected in epistatic analysis, mainly because of epistatic interactions.

Figure 3
The two-dimensional profile of Bayes factors for epistatic effects on the selected chromosomes.

Estimates for main-effect and for epistatic-effect QTL parameters, including QTL positions, regression effects and BFs, are shown in Tables 1 and 2, respectively. To illustrate the effects of QTLs on dynamic traits, we depict the changes in the main effects of the 12 QTLs with measurement time in Figure 4. These curves are grouped into three classes: convex (above), concave (middle) and linear (below). We find that the effects of the 10th QTL and the 12th QTL change strongly with growth time, in direct and inverse proportion, respectively, whereas the effects of the other QTLs do not change distinctly.

Figure 4
Changes in main effects of QTL detected with time. The above, middle and below are the convex, concave and linear groups, respectively.
Table 1
Estimates for regression effects of main-effect QTLs detected with Bayesian epistatic mapping analysis
Table 2
Estimates for regression effects of epistatic QTLs identified with Bayesian epistatic mapping analysis

Simulation

We simulated a dynamic trait measured at eight time points for 150 or 300 BC individuals. A genome consisting of a single large chromosome of 600 cM was simulated, which was covered by 61 evenly placed markers. The growth pattern of the dynamic trait was assumed to be controlled by the four additive QTLs and two pairs of epistatic QTLs with their positions and effects listed in Table 3. The order of the polynomial was set at 3, which generated the ‘S' shape growth trajectory for phenotypes. The dynamic trait is measured at the same 8 time points as in real data. The simulated population mean was μ=[45 44 −1 −7]T, covariance matrix for individual-specific environmental error was

equation image

and the residual variance was taken at 4.0.
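A data set of this kind could be generated along the following lines. This is a hedged sketch, not the authors' simulation code: the covariance matrix Σ and the QTL effects of Table 3 are not reproduced here, so the values of `Sigma`, `additive` and `epistatic` below are placeholders; QTLs are placed exactly on markers for simplicity; recombination follows the Haldane map function; and an unnormalized Legendre basis is used. Only the population mean, the residual variance, the marker layout, the sample size and the time points come from the text.

```python
import numpy as np
from numpy.polynomial import legendre

rng = np.random.default_rng(0)


def simulate_bc_markers(n_ind, n_markers=61, spacing_cm=10.0):
    """Backcross genotypes coded +1/-1 along one chromosome (Haldane, no interference)."""
    r = 0.5 * (1.0 - np.exp(-2.0 * spacing_cm / 100.0))  # recombination fraction
    geno = np.empty((n_ind, n_markers), dtype=int)
    geno[:, 0] = rng.choice([-1, 1], size=n_ind)
    for k in range(1, n_markers):
        recombined = rng.random(n_ind) < r
        geno[:, k] = np.where(recombined, -geno[:, k - 1], geno[:, k - 1])
    return geno


def simulate_phenotypes(geno, times, mu, Sigma, sigma2, additive, epistatic):
    """Phenotypes from the Legendre-based multiple interacting QTL model."""
    t = np.asarray(times, dtype=float)
    u = -1.0 + 2.0 * (t - t.min()) / (t.max() - t.min())
    basis = legendre.legvander(u, len(mu) - 1)  # (m+1) x (p+1)
    n = geno.shape[0]
    coef = np.tile(np.asarray(mu, dtype=float), (n, 1))  # population mean
    for locus, beta in additive.items():                  # main effects
        coef += geno[:, [locus]] * np.asarray(beta, dtype=float)
    for (j, k), delta in epistatic.items():               # epistatic effects
        coef += (geno[:, [j]] * geno[:, [k]]) * np.asarray(delta, dtype=float)
    # Individual-specific random regression coefficients.
    coef += rng.multivariate_normal(np.zeros(len(mu)), Sigma, size=n)
    noise = rng.normal(0.0, np.sqrt(sigma2), size=(n, len(t)))
    return coef @ basis.T + noise


times = [5, 8, 13, 18, 21, 26, 32, 39]
mu = [45, 44, -1, -7]                    # population mean from the text
Sigma = np.eye(4)                        # placeholder covariance matrix
additive = {10: [1.0, 0.5, 0.0, 0.0]}    # placeholder main effect at marker 10
epistatic = {(10, 40): [0.5, 0.0, 0.0, 0.0]}  # placeholder interaction
geno = simulate_bc_markers(n_ind=150)
Y = simulate_phenotypes(geno, times, mu, Sigma, 4.0, additive, epistatic)
print(geno.shape, Y.shape)
```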

Table 3
The regression effects of additive and epistatic QTLs simulated

In all analyses for simulated data, we set the prior number of main-effect QTLs at 4 and the prior expected number of epistatic QTLs at 2. The upper bound of the number of QTLs was then L=6+3√6=13. The actual values for the hyperparameters used here take the same values as in real data analyses. The initial values of all variables were sampled from their prior distributions. The MCMC is run for 10 000 cycles as a burn-in period (deleted) and then for an additional 150 000 cycles after the burn-in. The chain is then thinned to reduce serial correlation by saving one observation in every 50 cycles. The posterior sample contained 3000 observations for the post-MCMC analysis. Note that here the length of the burn-in is judged by visually inspecting the plots of some posterior samples across rounds and is set to enough cycles for ensuring the MCMC convergence. The simulation experiment is replicated 40 times for evaluating the statistical power of our proposed method. The statistical power is calculated as the percentage of the number of those simulations in which significant QTL is detected.

The purpose of the simulation is to show the performance of the method proposed herein in simultaneously detecting main-effect and epistatic QTLs under different sample sizes. Therefore, we do not compare our approach with other methods for only mapping main-effect QTLs, such as the maximum likelihood approach. Table 4 shows the estimates for regression effects of the given QTLs in Table 3 and the relative statistical power of QTL detection. Apparently, Bayesian mapping of genome-wide interacting loci for dynamic traits is able to accurately estimate the regression effects of QTLs detected. Furthermore, the estimation precision of parameters and statistical power of QTL detection, as expected, improve with the increasing effect or genetic contribution proportion of QTL and increasing sample sizes. In addition, we find that the Bayesian model selection for mapping QTLs of dynamic traits is sensitive to QTLs with a relatively small genetic effect, compared with the mapping results of QTLs with the same regression effects but a lower residual variance in Yang and Xu (2007).

Table 4
Mean estimates and s.d. (in parentheses) of QTL regression effects and statistical power of QTL detection

Discussion

By assigning a maximum number of detectable QTLs and using latent binary variables to indicate which main and epistatic effects of putative QTLs are included in or excluded from the model, Yi et al. (2005) first applied a Bayesian model selection method to identify epistatic QTLs in experimental crosses. The approach allows MCMC sampling for QTL parameters to be carried out in the reduced model space, enhancing the computational efficiency of Bayesian mapping of many epistatic QTLs. Subsequently, Yi et al. (2007a) extended the Bayesian model selection method for a single continuous trait to an ordinal trait. In this study, we adopt a multivariate version of the Bayesian model selection method to map epistatic QTL for dynamic traits. By pre-estimating indicator variables of putative QTL genotypes and exploring the posterior for indicator variables of genetic effects (Yi et al., 2007b), the Bayesian mapping method can fairly quickly identify interacting QTLs for dynamic traits in models with large numbers of genetic effects.

Generally, there are three types of epistatic interaction between QTLs: (1) where both QTLs are the main effect; (2) where both QTLs are not the main effect and (3) where only one QTL is the main effect. In mapping practice, Bayesian model selection can sensitively detect them by regulating dependence priors on genetic architecture indicators (Yi et al., 2007a, 2007b). However, the epistatic QTLs for leaf age growth are found only between main-effect QTLs in our real data analysis.

In fact, the orders of the polynomials for all effects in model (1) are unknown. We can only determine the order of the polynomial for the population mean according to the shape of the phenotypic trajectories of dynamic traits. In implementing our proposed method, we simply chose Legendre polynomial functions of the same order as for the population mean to fit the change in QTL genetic effects and time-dependent environmental effects with time. The shape of the population mean or of each effect depends on the estimates of the corresponding polynomial regression coefficients. Naturally, one would ask whether the order of the Legendre polynomial for each effect is indeed the same. Answering this question requires choosing each submodel in model (1). We may first choose the highest possible order and use it for all QTL effects and time-dependent environmental effects. For each QTL effect, we then assign a separate indicator variable to each regression coefficient in the nested polynomial and infer the significance of these regression coefficients by calculating the related BF value in post-MCMC analysis. For time-dependent environmental effects, however, it is difficult to infer the many individual-specific regression coefficients in the same way as for QTL effects because of their large number. In this case, we can adopt Bayesian model selection for the random covariance matrix in the mixed model (Chen and Dunson, 2003; Kinney and Dunson, 2007) to determine the order of the Legendre polynomial for time-dependent environmental effects. Once appropriate submodels are chosen for the population mean, all QTL effects and time-dependent environmental effects by using the procedures described above, the optimal multiple interacting QTL model for dynamic traits will be established. In choosing the submodel of each QTL effect and in Bayesian model selection for the random covariance matrix, the priors and posteriors for many new unknown variables need to be specified and deduced under multiple interacting QTL models for dynamic traits. These are being implemented in our research plan.

In addition, how to model the residuals is also a noteworthy question. Functional mapping recommends a parametric residual covariance structure based on the time series autocorrelation structure. The autoregressive model of order 1 [AR(1)], with one unknown parameter, is often used in functional mapping. However, there appears to be no efficient way to sample the autoregressive coefficient in a covariance matrix within the Bayesian framework. Our investigation found that specifying a uniform distribution as the prior for the autoregressive coefficient and using the sampling method proposed by Gianola et al. (2003) do not work in Bayesian functional mapping. In fact, the covariance structure described by ψTΣψ+Iσ2 is more flexible than the parametric structure because we can choose a different degree of the polynomial to fit a covariance structure with a different degree of complexity. Moreover, we can easily sample the covariance matrix Σ from a closed form of marginal posterior distribution.

The multiple interacting QTL model for dynamic traits proposed herein can be treated as a general form of the model for analyzing the genetic architecture of continuous traits. For instance, letting ψ=1 and ξi=0 as scalars, that is, only one measurement on each individual, leads to multiple interacting QTL models for single continuous quantitative traits; taking ψ to be an identity matrix of order m and ξi to be a zero vector results in a multiple interacting QTL model for multiple continuous quantitative traits; and if ξi is allowed to be nonzero in the two cases above, the multiple interacting QTL models for a single continuous quantitative trait and for multiple continuous quantitative traits are also able to make use of repeat records on the phenotypes. Corresponding Bayesian model selection approaches can likewise be obtained by setting ψ and ξi to different values or matrices.
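As a worked illustration of the first special case, assume that the matrix form of the model has the additive structure described in the Methods (population mean, indicator-weighted main and epistatic effects, individual-specific random effect and residual). Setting ψ=1 and ξi=0 then removes the time dimension, and every coefficient vector collapses to a scalar:

$$y_i = \mu + \sum_{j=1}^{q} \gamma_j x_{ij}\,\beta_j + \sum_{j=1}^{q-1}\sum_{k=j+1}^{q} \gamma_{jk} z_{ijk}\,\delta_{jk} + \varepsilon_i, \qquad \varepsilon_i \sim N(0, \sigma^{2})$$

Under the same assumption, the variance ψTΣψ+Iσ2 reduces to σ2, which is the usual form of a multiple interacting QTL model for a single continuous trait.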

Acknowledgments

The preparation of the manuscript was supported by the Chinese National Natural Science Foundation Grant 30972077 to RY.

Appendix A

Posterior distributions for unknown parameters

The marginal posterior distribution of μ, given all other parameters, is a multivariate normal with the mean

equation image

and the covariance matrix (nψV⁻¹ψᵀ)⁻¹.

The marginal posterior distribution of βj is also a normal, of which the mean is

equation image

and the covariance matrix is

equation image

Likewise, the marginal posterior distribution of δjk can be expressed as a normal distribution with a mean

equation image

and the covariance matrix

equation image

The marginal posterior distribution of ξi is a normal distribution with a mean

equation image

and a covariance matrix

equation image

where the marginal posterior distribution of Σ is

equation image

For the residual variance σ₀², the corresponding marginal posterior distribution is a scaled inverse χ² with parameters νe+n and

equation image

where ei = yi − Mi − ψᵀξi.
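Drawing from a scaled inverse χ² distribution is straightforward, because a draw can be obtained as (degrees of freedom × scale) divided by a χ² variate with the same degrees of freedom. The sketch below shows only this generic step; the function name and the numerical values are hypothetical, and the actual scale parameter of the posterior is the one given by the equation above, which is not reproduced here.

```python
import numpy as np


def sample_scaled_inv_chi2(df, scale, rng, size=None):
    """Draw from a scaled inverse chi-square distribution.

    If X ~ chi-square(df), then df * scale / X follows a scaled inverse
    chi-square distribution with df degrees of freedom and the given scale.
    """
    return df * scale / rng.chisquare(df, size=size)


# Hypothetical values, for illustration only.
rng = np.random.default_rng(0)
sigma2_draw = sample_scaled_inv_chi2(df=50, scale=2.0, rng=rng)
```

Within a Gibbs sampler, one such draw would be taken per iteration after the residuals have been recomputed from the current values of the other parameters.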

The marginal posterior distribution of γ is a Bernoulli with a probability

equation image

where w=wm and (equation image) (j=1, 2,…, p) for the additive effects; w=we and (equation image) (j=1, 2,…, q; k=j+1, j+2,…, q) for the epistatic effects. The Metropolis–Hastings algorithm is also used to sample γ, with acceptance rate (equation image).

All of the aforementioned posterior distributions have explicit forms, so samples can be drawn directly from them with the Gibbs sampler. Parameters without closed-form conditional posterior distributions, such as λ and X, are sampled with the Metropolis–Hastings algorithm. We sample QTL positions in L variable intervals whose boundaries are the positions of adjoining QTLs, and we restrict the minimal distance between two QTLs to 5 cM. The Metropolis–Hastings algorithm requires an acceptance rule for deciding whether the proposed value replaces the current value; detailed formulas of the acceptance rule for λ and X can be found in Yang and Xu (2007).
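The position update described above can be sketched as follows. This is an illustration under stated assumptions, not the acceptance rule of Yang and Xu (2007): the proposal is drawn uniformly from the interval bounded by the adjoining QTLs (shrunk by the 5 cM minimum distance), so the acceptance ratio reduces to a ratio of posterior densities. The function names and the toy log-posterior are hypothetical placeholders for the real model.

```python
import numpy as np


def proposal_bounds(positions, j, chrom_length, min_gap=5.0):
    """Interval in which QTL j may move, bounded by its neighbours."""
    lower = positions[j - 1] + min_gap if j > 0 else 0.0
    upper = positions[j + 1] - min_gap if j < len(positions) - 1 else chrom_length
    return lower, upper


def mh_update_position(positions, j, log_post, chrom_length, rng, min_gap=5.0):
    """One Metropolis-Hastings update of the position of QTL j (in cM)."""
    lower, upper = proposal_bounds(positions, j, chrom_length, min_gap)
    if upper <= lower:
        return positions                      # no room to move this QTL
    proposal = positions.copy()
    proposal[j] = rng.uniform(lower, upper)   # uniform proposal in the interval
    # The proposal density is the same in both directions, so it cancels.
    log_ratio = log_post(proposal) - log_post(positions)
    if np.log(rng.uniform()) < log_ratio:
        return proposal                       # accept the proposed value
    return positions                          # keep the current value


def toy_log_post(pos):
    """Hypothetical stand-in for the log-posterior of the QTL positions."""
    target = np.array([15.0, 45.0, 90.0])
    return -0.5 * np.sum(((pos - target) / 4.0) ** 2)


rng = np.random.default_rng(1)
positions = np.array([20.0, 50.0, 80.0])      # sorted positions on a 100 cM chromosome
for _ in range(200):
    for j in range(len(positions)):
        positions = mh_update_position(positions, j, toy_log_post, 100.0, rng)
```

Because each QTL moves only between its neighbours, the ordering of the positions and the minimum spacing are preserved throughout the chain.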

Appendix B

Simplification of the inverse and determinant for V

According to the formula proved by Henderson et al. (1959)

equation image

if we let R=Iσ₀², Z=ψᵀ and D=Σ, then the inverse of V can be simplified as

equation image

For the determinant of V,

equation image

Evidently, only the inverses and determinants of matrices of order p+1 need to be calculated to obtain the inverse and determinant of V.
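The saving can be checked numerically. The sketch below assumes V = ψᵀΣψ + Iσ₀² and uses the standard forms of the matrix inversion identity, (R + ZDZᵀ)⁻¹ = R⁻¹ − R⁻¹Z(D⁻¹ + ZᵀR⁻¹Z)⁻¹ZᵀR⁻¹, and of the matrix determinant lemma, |R + ZDZᵀ| = |R||D||D⁻¹ + ZᵀR⁻¹Z|. The function name and the random test matrices are illustrative assumptions.

```python
import numpy as np


def small_matrix_inverse_and_logdet(psi, Sigma, sigma2):
    """Inverse and log-determinant of V = psi^T Sigma psi + sigma2 * I.

    psi is k x m (k = polynomial order + 1, m = number of time points),
    so only k x k matrices are inverted.
    """
    k, m = psi.shape
    # k x k matrix: Sigma^{-1} + psi psi^T / sigma2
    A = np.linalg.inv(Sigma) + psi @ psi.T / sigma2
    V_inv = np.eye(m) / sigma2 - psi.T @ np.linalg.inv(A) @ psi / sigma2**2
    # log|V| = log|A| + log|Sigma| + m * log(sigma2)
    logdet = np.linalg.slogdet(A)[1] + np.linalg.slogdet(Sigma)[1] + m * np.log(sigma2)
    return V_inv, logdet


# Compare with direct computation on a random example.
rng = np.random.default_rng(2)
psi = rng.normal(size=(3, 8))
B = rng.normal(size=(3, 3))
Sigma = B @ B.T + np.eye(3)                   # symmetric positive definite
sigma2 = 0.7
V = psi.T @ Sigma @ psi + sigma2 * np.eye(8)
V_inv, logdet = small_matrix_inverse_and_logdet(psi, Sigma, sigma2)
max_abs_error = np.max(np.abs(V_inv - np.linalg.inv(V)))
```

When the number of time points m is much larger than the order of the polynomial, replacing an m × m inversion at every MCMC iteration with a small one can reduce the computing cost noticeably.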

Notes

The authors declare no conflict of interest.

References

  • Ball RD. Bayesian methods for quantitative trait loci mapping based on model selection: approximate analysis using the Bayesian information criterion. Genetics. 2001;159:1351–1364.
  • Carlin BP, Chib S. Bayesian model choice via Markov chain Monte Carlo. J Am Stat Assoc. 1995;88:881–889.
  • Chen Z, Dunson DB. Random effects selection in linear mixed models. Biometrics. 2003;59:762–769.
  • Cheverud JM, Rutledge JJ, Atchley WR. Quantitative genetics of development, genetic correlations among age-specific trait values and the evolution of ontogeny. Evolution. 1983;37:895–905.
  • Eaves LJ, Neale MC, Maes H. Multivariate multipoint linkage analysis of quantitative trait loci. Behav Genet. 1996;26:519–525.
  • Emebiri LC, Devey ME, Matheson AC, Slee MU. Age-related changes in the expression of QTLs for growth in radiata pine seedlings. Theor Appl Genet. 1998;97:1053–1061.
  • Gelman A, Carlin JB, Stern HS, Rubin DB. Bayesian Data Analysis. Chapman & Hall: New York; 1995.
  • Gianola D, Perez-Enciso M, Toro MA. On marker-assisted prediction of genetic value: Beyond the ridge. Genetics. 2003;163:347–365.
  • Hastings WK. Monte Carlo sampling methods using Markov chains and their applications. Biometrika. 1970;57:97–109.
  • Henderson CR, Kempthorne O, Searle SR, von Krosigk CM. The estimation of environmental and genetic trends from records subject to culling. Biometrics. 1959;15:192–218.
  • Huang SQ, Cui Y, Yang R. Functional mapping of dynamic traits with Legendre polynomial. Prog Nat Sci. 2005;10:1183–1188.
  • Jiang C, Zeng ZB. Multiple trait analysis of genetic mapping for quantitative trait loci. Genetics. 1995;140:1111–1127.
  • Kao CH, Zeng ZB. Modeling epistasis of quantitative trait loci using Cockerham's model. Genetics. 2002;160:1243–1261.
  • Kass RE, Raftery AE. Bayes factors. J Am Stat Assoc. 1995;90:773–795.
  • Kinney SK, Dunson DB. Fixed and random effects selection in linear and logistic models. Biometrics. 2007;63:690–698.
  • Kirkpatrick M, Heckman N. A quantitative genetic model for growth, shape, reaction norms, and other infinite-dimensional characters. J Math Biol. 1989;27:429–450.
  • Kirkpatrick M, Lofsvold D, Bulmer M. Analysis of the inheritance, selection and evolution of growth trajectories. Genetics. 1990;124:979–993.
  • Knott SA, Haley CS. Multitrait least squares for quantitative trait loci detection. Genetics. 2000;156:899–911.
  • Kohn R, Smith M, Chan D. Nonparametric regression using linear combinations of basis functions. Stat Comput. 2001;11:313–322.
  • Korol AB, Ronin YI, Kirzhner VM. Interval mapping of quantitative trait loci employing correlated trait complexes. Genetics. 1995;140:1137–1147.
  • Ma CX, Casella G, Wu RL. Functional mapping of quantitative trait loci underlying the character process: a theoretical framework. Genetics. 2002;161:1751–1762.
  • Macgregor S, Knott SA, White I, Visscher PM. Quantitative trait locus analysis of longitudinal quantitative trait data in complex pedigrees. Genetics. 2005;171:1365–1376.
  • Metropolis N, Rosenbluth AW, Rosenbluth MN, Teller AH, Teller E. Equation of state calculations by fast computing machines. J Chem Phys. 1953;21:1087–1092.
  • Nuzhdin SV, Pasyukova EG, Dilda CL, Zeng ZB, Mackay TFC. Sex-specific quantitative trait loci affecting longevity in Drosophila melanogaster. Proc Natl Acad Sci USA. 1997;94:9734–9739.
  • Plummer M, Best N, Cowles K, Vines K. CODA: convergence diagnosis and output analysis for MCMC. R News. 2006;6:7–10.
  • Raftery AE, Madigan D, Hoeting JA. Bayesian model averaging for linear regression models. J Am Stat Assoc. 1997;92:179–191.
  • Ronin YI, Kirzhner VM, Korol AB. Linkage between loci of quantitative traits and marker loci: multi-trait analysis with a single marker. Theor Appl Genet. 1995;90:776–786.
  • Schaeffer LR. Application of random regression models in animal breeding. Livest Prod Sci. 2004;86:35–45.
  • Sillanpää MJ, Corander J. Model choice in gene mapping: what and why. Trends Genet. 2002;18:301–307.
  • Verhaegen D, Plomion C, Gion JM, Poitel M, Costa P, Kremer A. Quantitative trait dissection analysis in Eucalyptus using RAPD markers: 1. Detection of QTL in interspecific hybrid progeny, stability of QTL expression across different ages. Theor Appl Genet. 1997;95:597–608.
  • Wang H, Zhang YM, Li X, Masinde GL, Mohan S, Baylink DJ, et al. Bayesian shrinkage estimation of quantitative trait loci parameters. Genetics. 2005;170:465–480.
  • Weng Q, Wu W, Li W, Liu H, Tang D, Zhou Y, et al. Construction of an RFLP linkage map of rice using DNA probes from two different sources. J Fujian Agric Univ. 2000;29:129–133.
  • Wu R, Lin M. Opinion: functional mapping - how to map and study the genetic architecture of dynamic complex traits. Nat Rev Genet. 2006;7:229–237.
  • Wu R, Ma CX, Lin M, Casella G. A general framework for analyzing the genetic architecture of developmental characteristics. Genetics. 2004a;166:1541–1551.
  • Wu R, Ma CX, Lin M, Wang Z, Casella G. Functional mapping of quantitative trait loci underlying growth trajectories using a transform-both-sides logistic model. Biometrics. 2004b;60:729–738.
  • Wu R, Ma CX, Zhu J, Casella G. Mapping epigenetic quantitative trait loci (QTL) altering a developmental trajectory. Genome. 2002;45:28–33.
  • Wu R, Wang Z, Zhao W, Cheverud JM. A mechanistic model for genetic machinery of ontogenetic growth. Genetics. 2004c;168:2383–2394.
  • Wu WR, Li WM, Tang DZ, Lu HR, Worland AJ. Time-related mapping of quantitative trait loci underlying tiller number in rice. Genetics. 1999;151:297–303.
  • Yan J, Zhu J, He C, Benmoussa M, Wu P. Molecular dissection of developmental behavior of plant height in rice (Oryza sativa L.). Genetics. 1998a;150:1257–1265.
  • Yan JQ, Zhu J, He CX, Benmoussa M, Wu P. Quantitative trait loci analysis for the developmental behavior of tiller number in rice (Oryza sativa L.). Theor Appl Genet. 1998b;97:267–274.
  • Yang R, Gao H, Wang X, Zhang J, Zeng ZB, Wu R. A semiparametric approach for composite functional mapping of dynamic quantitative traits. Genetics. 2007;177:1859–1870.
  • Yang R, Tian Q, Xu S. Mapping quantitative trait loci for longitudinal traits in line crosses. Genetics. 2006;173:2339–2356.
  • Yang R, Xu S. Bayesian shrinkage analysis of quantitative trait loci for dynamic traits. Genetics. 2007;176:1169–1185.
  • Yang RQ, Gao HJ, Sun H, Xu S. Maximum likelihood analysis for mapping dynamic trait QTL in outbred population I. Methodology. Acta Genet Sin. 2004;31:1116–1122.
  • Yi N. A unified Markov chain Monte Carlo framework for mapping multiple quantitative trait loci. Genetics. 2004;167:967–975.
  • Yi N, Banerjee S, Pomp D, Yandell BS. Bayesian mapping of genomewide interacting quantitative trait loci for ordinal traits. Genetics. 2007a;176:1855–1864.
  • Yi N, Shriner D, Banerjee S, Mehta T, Pomp D, Yandell BS. An efficient Bayesian model selection approach for interacting quantitative trait loci models with many effects. Genetics. 2007b;176:1865–1877.
  • Yi N, Yandell BS, Churchill GA, Allison DB, Eisen EJ, Pomp D. Bayesian model selection for genome-wide epistatic quantitative trait loci analysis. Genetics. 2005;170:1333–1344.
  • Zhou Y, Li W, Wu W, Chen Q, Mao D, Worland AJ. Genetic dissection of heading time and its components in rice. Theor Appl Genet. 2001;102:1236–1242.

Articles from Heredity are provided here courtesy of Nature Publishing Group