Genet Sel Evol. 2015 Mar 7;47:13. doi: 10.1186/s12711-015-0092-x.

Improving the computational efficiency of fully Bayes inference and assessing the effect of misspecification of hyperparameters in whole-genome prediction models.

Author information

1. Department of Animal Science, Michigan State University, East Lansing, MI, 48824-1225, USA. mishelywz@gmail.com.
2. Department of Animal Science, Michigan State University, East Lansing, MI, 48824-1225, USA. chench57@msu.edu.
3. Department of Animal Science, Michigan State University, East Lansing, MI, 48824-1225, USA. tempelma@msu.edu.

Abstract

BACKGROUND:

The reliability of whole-genome prediction (WGP) models based on high-density single nucleotide polymorphism (SNP) panels depends critically on proper specification of key hyperparameters. A currently popular WGP model, labeled BayesB, specifies a hyperparameter π that is loosely used to describe the proportion of SNPs that are in linkage disequilibrium (LD) with causal variants. The remaining markers are specified to be random draws from a Student t distribution, with key hyperparameters being the degrees of freedom v and the scale s².
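To make the roles of π, v and s² concrete, the following is a minimal sketch of drawing SNP effects from a BayesB-style prior. It assumes the common convention, consistent with the abstract's BayesA (π = 1) case, in which a fraction π of marker effects follows a Student t distribution and the rest are exactly zero. The Student t draw is written as a normal draw whose variance comes from a scaled inverse chi-square distribution with v degrees of freedom and scale s². The function name `draw_bayesb_effects` and its arguments are hypothetical and do not come from the paper.

```python
import math
import random


def draw_bayesb_effects(n_snps, pi, v, s2, rng):
    """Draw SNP effects from an assumed BayesB-style mixture prior.

    With probability pi an effect is Student t (df v, scale s2), generated
    as N(0, sigma2) with sigma2 ~ scaled inverse chi-square(v, s2);
    otherwise the effect is set to zero. Illustrative sketch only.
    """
    effects = []
    for _ in range(n_snps):
        if rng.random() < pi:
            # chi-square with v df is Gamma(shape=v/2, scale=2)
            chi2 = rng.gammavariate(v / 2.0, 2.0)
            sigma2 = v * s2 / chi2  # scaled inverse chi-square draw
            effects.append(rng.gauss(0.0, math.sqrt(sigma2)))
        else:
            effects.append(0.0)
    return effects
```

Setting `pi = 1.0` in this sketch gives every marker a Student t effect, which corresponds to the BayesA special case.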

METHODS:

We consider three alternative Markov chain Monte Carlo (MCMC) approaches based on the use of Metropolis-Hastings (MH) to estimate these key hyperparameters. The first approach, termed DFMH, is based on a previously published strategy in which s² is drawn by a Gibbs step and v is drawn by an MH step. The second strategy, termed UNIMH, substitutes MH for Gibbs when drawing s² and further collapses, or marginalizes, the full conditional density of v. The third strategy, termed BIVMH, is based on jointly drawing the two hyperparameters in a bivariate MH step. We also tested the effect of misspecifying s² on the accuracy of genomic estimated breeding values (GEBV), while still allowing for inference on the other hyperparameters.
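As an illustration of the general pattern behind a DFMH-style scheme (a Gibbs draw for s² alternating with an MH draw for v), the sketch below conditions on marker-specific variances that follow a scaled inverse chi-square distribution with parameters v and s². It is not the authors' implementation. Several choices are assumptions made only for this example: the prior p(v) ∝ (1 + v)⁻², a flat prior on s², a random-walk proposal on log v, and all function names.

```python
import math
import random


def log_cond_v(v, s2, sigma2):
    """Log full conditional of v (up to a constant) given marker variances."""
    m = len(sigma2)
    sum_log = sum(math.log(x) for x in sigma2)
    sum_inv = sum(1.0 / x for x in sigma2)
    # Log density of m scaled inverse chi-square(v, s2) variates
    loglik = (m * (0.5 * v * math.log(0.5 * v * s2) - math.lgamma(0.5 * v))
              - (1.0 + 0.5 * v) * sum_log
              - 0.5 * v * s2 * sum_inv)
    log_prior = -2.0 * math.log(1.0 + v)  # assumed prior, p(v) ∝ (1 + v)^-2
    return loglik + log_prior


def mh_update_v(v, s2, sigma2, step, rng):
    """One random-walk MH update of v on the log scale."""
    v_new = v * math.exp(rng.gauss(0.0, step))
    # The log(v_new) - log(v) term is the Jacobian of the log transform
    log_ratio = (log_cond_v(v_new, s2, sigma2) - log_cond_v(v, s2, sigma2)
                 + math.log(v_new) - math.log(v))
    u = 1.0 - rng.random()  # uniform on (0, 1], avoids log(0)
    if math.log(u) < log_ratio:
        return v_new, True
    return v, False


def gibbs_update_s2(v, sigma2, rng):
    """Gibbs draw of s2 under an assumed flat prior: a gamma full conditional."""
    m = len(sigma2)
    shape = 0.5 * m * v + 1.0
    rate = 0.5 * v * sum(1.0 / x for x in sigma2)
    return rng.gammavariate(shape, 1.0 / rate)  # gammavariate takes a scale
```

In a full sampler, `gibbs_update_s2` and `mh_update_v` would be called once per MCMC iteration after the marker variances are updated. Because each hyperparameter is drawn conditionally on the other, successive draws of v and s² can be strongly correlated, which is the kind of inefficiency that collapsed or joint updates are meant to reduce.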

RESULTS:

The UNIMH and BIVMH strategies had significantly greater (P < 0.05) computational efficiencies for estimating v and s² than DFMH in both BayesA (π = 1) and BayesB implementations. We drew similar conclusions from an analysis of the public domain heterogeneous stock mice data. We also found significant drops (P < 0.01) in the accuracy of GEBV under BayesA when s² was overspecified, whereas BayesB was more robust to such misspecification. However, understating s² was compensated for by counterbalancing inferences on v in BayesA and BayesB, and on π in BayesB.

CONCLUSIONS:

Sampling strategies based solely on MH updates of v and s², together with collapsed representations of full conditional densities, can improve the computational efficiency of MCMC relative to the use of Gibbs updates. We believe that proper inferences on s², v and π are vital to ensure that the accuracy of GEBV is maximized when using parametric WGP models.

PMID: 25885894
PMCID: PMC4351701
DOI: 10.1186/s12711-015-0092-x
[Indexed for MEDLINE]
Free PMC Article
