PLoS Genet. 2015 Apr 7;11(4):e1004969. doi: 10.1371/journal.pgen.1004969. eCollection 2015 Apr.

Simultaneous discovery, estimation and prediction analysis of complex traits using a Bayesian mixture model.

Author information

1. Queensland Brain Institute, University of Queensland, Brisbane, Australia.
2. Department of Primary Industries, Biosciences Research Division, Bundoora, Australia; Dairy Futures Cooperative Research Centre, Bundoora, Australia.
3. Department of Primary Industries, Biosciences Research Division, Bundoora, Australia; Faculty of Land and Food Resources, University of Melbourne, Melbourne, Australia.
4. Queensland Brain Institute, University of Queensland, Brisbane, Australia; University of Queensland Diamantina Institute, University of Queensland, Translational Research Institute (TRI), Brisbane, Australia.

Abstract

Gene discovery, estimation of heritability captured by SNP arrays, inference on genetic architecture and prediction analyses of complex traits are usually performed using different statistical models and methods, leading to inefficiency and loss of power. Here we use a Bayesian mixture model that simultaneously allows variant discovery, estimation of genetic variance explained by all variants and prediction of unobserved phenotypes in new samples. We apply the method to simulated data of quantitative traits and Wellcome Trust Case Control Consortium (WTCCC) data on disease and show that it provides accurate estimates of SNP-based heritability, produces unbiased estimators of risk in new samples, and that it can estimate genetic architecture by partitioning variation across hundreds to thousands of SNPs. We estimated that, depending on the trait, 2,633 to 9,411 SNPs explain all of the SNP-based heritability in the WTCCC diseases. The majority of those SNPs (>96%) had small effects, confirming a substantial polygenic component to common diseases. The proportion of the SNP-based variance explained by large effects (each SNP explaining 1% of the variance) varied markedly between diseases, ranging from almost zero for bipolar disorder to 72% for type 1 diabetes. Prediction analyses demonstrate that for diseases with major loci, such as type 1 diabetes and rheumatoid arthritis, Bayesian methods outperform profile scoring or mixed model approaches.

PMID:
25849665
PMCID:
PMC4388571
DOI:
10.1371/journal.pgen.1004969
[Indexed for MEDLINE]
Free PMC Article
