Am J Hum Genet. 2019 Feb 7;104(2):260-274. doi: 10.1016/j.ajhg.2018.12.012. Epub 2019 Jan 10.

Efficient Variant Set Mixed Model Association Tests for Continuous and Binary Traits in Large-Scale Whole-Genome Sequencing Studies.

Author information

1. Human Genetics Center, Department of Epidemiology, Human Genetics and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA; Center for Precision Health, School of Public Health and School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA.
2. Center for Population Genomics, VA Boston Healthcare System, Jamaica Plain, MA 02130, USA.
3. Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle, WA 98101, USA.
4. Department of Epidemiology and Biostatistics, School of Public Health, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei 430030, China.
5. Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA.
6. Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA.
7. Department of Biostatistics, University of Washington, Seattle, WA 98195, USA.
8. Division of Sleep and Circadian Disorders, Brigham and Women's Hospital, Boston, MA 02115, USA; Division of Sleep Medicine, Harvard Medical School, Boston, MA 02115, USA.
9. Department of Epidemiology, School of Public Health, University of Michigan, Ann Arbor, MI 48109, USA.
10. Department of Human Genetics and South Texas Diabetes and Obesity Institute, School of Medicine, The University of Texas Rio Grande Valley, Brownsville, TX 78520, USA.
11. Division of Pulmonary Medicine, Department of Medicine, National Jewish Health, Denver, CO 80206, USA.
12. Channing Division of Network Medicine, Brigham and Women's Hospital, Boston, MA 02115, USA; Division of Pulmonary and Critical Care Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115, USA.
13. Jackson Heart Study, Department of Medicine, University of Mississippi Medical Center, Jackson, MS 39216, USA.
14. Human Genetics Center, Department of Epidemiology, Human Genetics and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA.
15. Department of Psychiatry, Yale University School of Medicine, New Haven, CT 06510, USA; Olin Neuropsychiatric Research Center, Institute of Living, Hartford Hospital, Hartford, CT 06106, USA.
16. The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, Los Angeles Biomedical Research Institute at Harbor-UCLA Medical Center, Torrance, CA 90502, USA.
17. Framingham Heart Study, National Heart, Lung, and Blood Institute and Boston University, Framingham, MA 01702, USA.
18. Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA.
19. Department of Medicine, University of Maryland School of Medicine, Baltimore, MD 21201, USA.
20. USF Genomics, College of Public Health, University of South Florida, Tampa, FL 33612, USA.
21. Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA.
22. Department of Medicine, University of Maryland School of Medicine, Baltimore, MD 21201, USA; Geriatrics Research and Education Clinical Center, Baltimore VA Medical Center, Baltimore, MD 21201, USA.
23. Division of Cardiology, Johns Hopkins University, Baltimore, MD 21287, USA.
24. Center for Public Health Genomics, University of Virginia, Charlottesville, VA 22908, USA.
25. Framingham Heart Study, National Heart, Lung, and Blood Institute and Boston University, Framingham, MA 01702, USA; Sections of Preventive Medicine and Epidemiology, and of Cardiology, Department of Medicine, Boston University School of Medicine, Boston, MA 02118, USA; Department of Epidemiology, Boston University School of Public Health, Boston, MA 02118, USA.
26. Department of Physiology and Biophysics, University of Mississippi Medical Center, Jackson, MS 39216, USA.
27. Division of Sleep and Circadian Disorders, Brigham and Women's Hospital, Boston, MA 02115, USA; Division of Sleep Medicine, Harvard Medical School, Boston, MA 02115, USA; Division of Pulmonary, Critical Care, and Sleep Medicine, Beth Israel Deaconess Medical Center, Boston, MA 02115, USA.
28. Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle, WA 98101, USA; Kaiser Permanente Washington Health Research Institute, Seattle, WA 98101, USA; Seattle Epidemiologic Research and Information Center, Department of Veterans Affairs Office of Research and Development, Seattle, WA 98108, USA; Department of Epidemiology, University of Washington, Seattle, WA 98195, USA.
29. Human Genetics Center, Department of Epidemiology, Human Genetics and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA; Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX 77030, USA.
30. Framingham Heart Study, National Heart, Lung, and Blood Institute and Boston University, Framingham, MA 01702, USA; Department of Biostatistics, Boston University School of Public Health, Boston, MA 02118, USA.
31. Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA; Department of Statistics, Harvard University, Cambridge, MA 02138, USA. Electronic address: xlin@hsph.harvard.edu.

Abstract

With advances in whole-genome sequencing (WGS) technology, more advanced statistical methods for testing genetic association with rare variants are being developed. Methods in which variants are grouped for analysis are also known as variant-set, gene-based, and aggregate unit tests. The burden test and sequence kernel association test (SKAT) are two widely used variant-set tests, which were originally developed for samples of unrelated individuals and were later extended to family data with known pedigree structures. However, computationally efficient and powerful variant-set tests are needed to make analyses tractable in large-scale WGS studies with complex study samples. In this paper, we propose the variant-set mixed model association tests (SMMAT) for continuous and binary traits using the generalized linear mixed model framework. These tests can be applied to large-scale WGS studies involving samples with population structure and relatedness, such as in the National Heart, Lung, and Blood Institute's Trans-Omics for Precision Medicine (TOPMed) program. SMMATs share the same null model for different variant sets, and a virtue of this null model, which includes covariates only, is that it needs to be fit only once for all tests in each genome-wide analysis. Simulation studies show that all the proposed SMMATs correctly control type I error rates for both continuous and binary traits in the presence of population structure and relatedness. We also illustrate our tests in a real data example of analysis of plasma fibrinogen levels in the TOPMed program (n = 23,763), using the Analysis Commons, a cloud-based computing platform.
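
The idea of fitting a covariates-only null model once and reusing it for every variant set can be illustrated with a minimal sketch. This is not the SMMAT implementation: it assumes unrelated individuals and a continuous trait, replaces the generalized linear mixed model with ordinary least squares (no random effects for relatedness), uses simulated data, and computes only unscaled burden-type and SKAT-type score statistics without p-values. The function names `fit_null_residuals` and `variant_set_stats` are hypothetical.

```python
import numpy as np


def fit_null_residuals(y, X):
    """Fit a covariates-only null model by ordinary least squares.

    Assumption: OLS stands in for the mixed model described in the abstract.
    Returns the residuals, which are reused for every variant set.
    """
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta


def variant_set_stats(resid, G, weights):
    """Return unscaled (burden-type, SKAT-type) score statistics for one set.

    G is an n x m genotype matrix; weights has one entry per variant.
    """
    scores = G.T @ resid                            # per-variant score
    burden = float(np.dot(weights, scores) ** 2)    # square of weighted sum
    skat = float(np.sum((weights * scores) ** 2))   # sum of weighted squares
    return burden, skat


# Simulated example data (hypothetical, for illustration only).
rng = np.random.default_rng(0)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=n)])  # intercept + covariate
y = X @ np.array([1.0, 0.5]) + rng.normal(size=n)

# The null model is fit a single time...
resid = fit_null_residuals(y, X)

# ...and its residuals are reused across all variant sets.
variant_sets = {
    "set_a": rng.binomial(2, 0.02, size=(n, 5)).astype(float),
    "set_b": rng.binomial(2, 0.01, size=(n, 8)).astype(float),
}
results = {}
for name, G in variant_sets.items():
    w = np.ones(G.shape[1])  # equal weights, an arbitrary choice here
    results[name] = variant_set_stats(resid, G, w)
```

The contrast between the two statistics is visible in the code: the burden-type statistic sums the weighted scores before squaring, so effects in opposite directions can cancel, while the SKAT-type statistic squares each weighted score before summing. Per-set cost is a single matrix-vector product because the null fit is not repeated.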

KEYWORDS:

TOPMed; generalized linear mixed model; population structure; rare variants; relatedness; variant set association test; whole-genome sequencing

PMID: 30639324
DOI: 10.1016/j.ajhg.2018.12.012
