Nat Genet. 2019 Dec;51(12):1749-1755. doi: 10.1038/s41588-019-0530-8. Epub 2019 Nov 25.

A resource-efficient tool for mixed model association analysis of large-scale data.

Author information

1. Institute for Molecular Bioscience, The University of Queensland, Brisbane, Queensland, Australia.
2. Institute for Advanced Research, Wenzhou Medical University, Wenzhou, Zhejiang, China.
3. Queensland Brain Institute, The University of Queensland, Brisbane, Queensland, Australia.
4. Institute for Molecular Bioscience, The University of Queensland, Brisbane, Queensland, Australia. jian.yang.qt@gmail.com.
5. Institute for Advanced Research, Wenzhou Medical University, Wenzhou, Zhejiang, China. jian.yang.qt@gmail.com.

Abstract

The genome-wide association study (GWAS) has been widely used as an experimental design to detect associations between genetic variants and a phenotype. Two major confounding factors, population stratification and relatedness, could potentially lead to inflated GWAS test statistics and hence to spurious associations. Mixed linear model (MLM)-based approaches can be used to account for sample structure. However, genome-wide association (GWA) analyses in biobank samples such as the UK Biobank (UKB) often exceed the capability of most existing MLM-based tools especially if the number of traits is large. Here, we develop an MLM-based tool (fastGWA) that controls for population stratification by principal components and for relatedness by a sparse genetic relationship matrix for GWA analyses of biobank-scale data. We demonstrate by extensive simulations that fastGWA is reliable, robust and highly resource-efficient. We then apply fastGWA to 2,173 traits on array-genotyped and imputed samples from 456,422 individuals and to 2,048 traits on whole-exome-sequenced samples from 46,191 individuals in the UKB.
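
As an illustration of what a "sparse genetic relationship matrix" can mean in practice, the following minimal sketch keeps the diagonal of a relationship matrix and only those off-diagonal entries at or above a cutoff, so that pairs of effectively unrelated individuals take no storage. It is not the fastGWA implementation: the function name `sparsify_grm`, the cutoff of 0.05, and the toy matrix values are all assumptions made for this example.

```python
def sparsify_grm(grm, threshold=0.05):
    """Return a sparse view of a symmetric relationship matrix.

    Keeps every diagonal entry and each lower-triangle off-diagonal entry
    whose value is >= threshold. The threshold is an assumed, illustrative
    cutoff, not a value taken from the abstract.
    """
    sparse = {}
    n = len(grm)
    for i in range(n):
        for j in range(i + 1):  # lower triangle only; the matrix is symmetric
            if i == j or grm[i][j] >= threshold:
                sparse[(i, j)] = grm[i][j]
    return sparse


# Toy matrix for 4 individuals (made-up values): only 0 and 1 are close relatives.
grm = [
    [1.00, 0.48, 0.01, -0.02],
    [0.48, 1.02, 0.00, 0.01],
    [0.01, 0.00, 0.99, 0.03],
    [-0.02, 0.01, 0.03, 1.01],
]

sparse = sparsify_grm(grm)
print(sorted(sparse))
```

In this toy case only one off-diagonal pair survives the cutoff, which shows why such a matrix stays small when most individuals in a sample are unrelated.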

PMID: 31768069
DOI: 10.1038/s41588-019-0530-8
[Indexed for MEDLINE]
