Nat Commun. 2019 Nov 8;10(1):5086. doi: 10.1038/s41467-019-12653-0.

Improved polygenic prediction by Bayesian multiple regression on summary statistics.

Author information

1. Institute for Molecular Bioscience, University of Queensland, St Lucia, Brisbane, 4072, QLD, Australia. luke.lloydjones@uqconnect.edu.au.
2. Institute for Molecular Bioscience, University of Queensland, St Lucia, Brisbane, 4072, QLD, Australia. j.zeng@uq.edu.au.
3. Institute for Molecular Bioscience, University of Queensland, St Lucia, Brisbane, 4072, QLD, Australia.
4. Estonian Genome Center, Institute of Genomics, University of Tartu, Riia 23b, 51010, Tartu, Estonia.
5. School of Engineering and Technology, Central Queensland University, Rockhampton, 4702, QLD, Australia.
6. Australian Agricultural Company Ltd, Brisbane, 4006, QLD, Australia.
7. Institute of Molecular and Cell Biology, University of Tartu, 51010, Tartu, Estonia.
8. Queensland Brain Institute, University of Queensland, Brisbane, 4072, QLD, Australia.
9. Faculty of Veterinary and Agricultural Science, University of Melbourne, Melbourne, 3052, VIC, Australia.
10. Institute for Molecular Bioscience, University of Queensland, St Lucia, Brisbane, 4072, QLD, Australia. jian.yang.qt@gmail.com.
11. Institute for Advanced Research, Wenzhou Medical University, Wenzhou, 325027, Zhejiang, China. jian.yang.qt@gmail.com.
12. Institute for Molecular Bioscience, University of Queensland, St Lucia, Brisbane, 4072, QLD, Australia. peter.visscher@uq.edu.au.

Abstract

Accurate prediction of an individual's phenotype from their DNA sequence is one of the great promises of genomics and precision medicine. We extend BayesR, a powerful Bayesian multiple regression model for individual-level data, to SBayesR, a model that utilises summary statistics from genome-wide association studies (GWAS). In simulation and cross-validation using 12 real traits and 1.1 million variants on 350,000 individuals from the UK Biobank, SBayesR improves prediction accuracy relative to commonly used state-of-the-art summary statistics methods at a fraction of the computational resources. Furthermore, using summary statistics for variants from the largest GWAS meta-analysis (n ≈ 700,000) on height and BMI, we show that, on average across traits and two independent data sets, SBayesR improves prediction R² by 5.2% relative to LDpred and by 26.5% relative to clumping and p value thresholding.
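
The accuracy comparisons above are stated as improvements in prediction R² relative to a baseline method. The following is a minimal sketch of that arithmetic, assuming prediction R² is the squared correlation between a predicted polygenic score and the observed phenotype, and that "relative to" means a proportional gain over the baseline. The function names and all numbers are hypothetical illustrations, not values or code from the paper.

```python
def polygenic_score(genotypes, effects):
    """Weighted sum of allele counts (0/1/2) by per-variant effect estimates."""
    return sum(g * b for g, b in zip(genotypes, effects))


def prediction_r2(predicted, observed):
    """Squared Pearson correlation between predicted scores and phenotypes."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    return cov * cov / (var_p * var_o)


def relative_improvement(r2_new, r2_baseline):
    """Proportional gain of one method's R^2 over a baseline method's R^2."""
    return (r2_new - r2_baseline) / r2_baseline


# Toy example: three variants, made-up effect sizes (assumption, not real data).
score = polygenic_score([0, 1, 2], [0.1, -0.2, 0.05])

# Toy R^2 values for two hypothetical methods (not results from the paper).
gain = relative_improvement(0.30, 0.25)
print(f"score = {score:.2f}, relative gain = {gain:.0%}")
```

Under this reading, a method with R² of 0.30 improves on a baseline with R² of 0.25 by 20%, even though the absolute difference is only 0.05.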
