PLoS One. 2017 Mar 16;12(3):e0173997. doi: 10.1371/journal.pone.0173997. eCollection 2017.

A comprehensive survey of genetic variation in 20,691 subjects from four large cohorts.

Author information

1. Program in Genetic Epidemiology and Statistical Genetics, Harvard T.H. Chan School of Public Health, Boston, MA, United States of America.
2. Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, United States of America.
3. Department of Epidemiology, University of Washington, Seattle, WA, United States of America.
4. Department of Ophthalmology, Harvard Medical School, Massachusetts Eye and Ear Infirmary, Boston, MA, United States of America.
5. Gastrointestinal Unit, Massachusetts General Hospital, Boston, MA, United States of America.
6. Section of Rheumatology and Clinical Epidemiology Unit, Boston University School of Medicine, Boston, MA, United States of America.
7. Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL, United States of America.
8. Channing Division of Network Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, United States of America.
9. Renal Division, Department of Medicine, Brigham and Women's Hospital, Boston, MA, United States of America.
10. Department of Medical Oncology, Dana-Farber Cancer Institute and Harvard Medical School, Boston, MA, United States of America.
11. Division of Aging, Department of Medicine, Brigham and Women's Hospital, Boston, MA, United States of America.
12. Department of Biostatistics and Epidemiology, University of Massachusetts, Amherst, MA, United States of America.
13. Department of Nutrition, Harvard T.H. Chan School of Public Health, Boston, MA, United States of America.
14. Department of Emergency Medicine, Center for Vascular Emergencies, Massachusetts General Hospital, Harvard Medical School, Boston, MA, United States of America.
15. Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, United States of America.

Abstract

The Nurses' Health Study (NHS), Nurses' Health Study II (NHSII), Health Professionals Follow-up Study (HPFS) and Physicians' Health Study (PHS) have collected detailed longitudinal data on multiple exposures and traits for approximately 310,000 study participants over the last 35 years. Over 160,000 study participants across the cohorts have donated a DNA sample and, to date, 20,691 subjects have been genotyped as part of genome-wide association studies (GWAS) of twelve primary outcomes. However, these studies utilized six different GWAS arrays, making it difficult to conduct analyses of secondary phenotypes or share controls across studies. To allow for secondary analyses of these data, we have created three new datasets merged by platform family and performed imputation using a common reference panel, the 1000 Genomes Phase I release. Here, we describe the methodology behind the data merging and imputation and present imputation quality statistics and association results from two GWAS of secondary phenotypes: body mass index (BMI) and venous thromboembolism (VTE). We observed the strongest BMI association for the FTO SNP rs55872725 (β = 0.45, p = 3.48 × 10⁻²²), and using a significance level of p = 0.05, we replicated 19 out of 32 known BMI SNPs. For VTE, we observed the strongest association for the rs2040445 SNP (OR = 2.17, 95% CI: 1.79-2.63, p = 2.70 × 10⁻¹⁵), located downstream of F5, and also observed significant associations for the known ABO and F11 regions. This pooled resource can be used to maximize power in GWAS of phenotypes collected across the cohorts and for studying gene-environment interactions as well as rare phenotypes and genotypes.
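
As a reading aid, the reported VTE odds ratio, confidence interval and p-value can be checked for internal consistency. The sketch below is illustrative only and is not the authors' analysis: it assumes a Wald-type 95% interval that is symmetric on the log-odds scale and a two-sided normal test, and the function name `wald_from_ci` is hypothetical. Under those assumptions the implied p-value should land in the same order of magnitude as the reported one, with small differences expected from rounding of the published estimates.

```python
import math


def wald_from_ci(or_est, ci_low, ci_high, z_crit=1.959964):
    """Back out a Wald z-statistic and two-sided p-value from an OR and its 95% CI.

    Assumption (not stated in the abstract): the interval is
    exp(log(OR) +/- z_crit * SE), i.e. symmetric on the log-odds scale.
    """
    log_or = math.log(or_est)
    # Width of the interval on the log scale equals 2 * z_crit * SE.
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = log_or / se
    # Two-sided p-value under a standard normal: erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return log_or, se, z, p


# Values as reported in the abstract for rs2040445 (VTE).
log_or, se, z, p = wald_from_ci(2.17, 1.79, 2.63)
print(f"log(OR) = {log_or:.4f}, SE = {se:.4f}, z = {z:.2f}, p = {p:.2e}")
```

The same function can be applied to any odds ratio published with a 95% interval; it cannot be applied to the BMI result, because the abstract reports β without a standard error or interval.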

PMID: 28301549
PMCID: PMC5354293
DOI: 10.1371/journal.pone.0173997
[Indexed for MEDLINE]
Free PMC Article
