Genet Epidemiol. 2014 Nov;38(7):622-637. doi: 10.1002/gepi.21840. Epub 2014 Sep 9.

Generalized functional linear models for gene-based case-control association studies.

Author information

1 Biostatistics and Bioinformatics Branch, Division of Intramural Population Health Research, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, Rockville, MD 20852.
2 Epidemiology Branch, Division of Intramural Population Health Research, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, Rockville, MD 20852.
3 Center for Human Genetics, Marshfield Clinic, Marshfield, WI 54449.
4 Department of Neurology, School of Medicine, University of California, San Francisco, CA 94185.
5 Statistical Genetics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892.
6 Departments of Human Genetics and Biostatistics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, PA 15261.
7 Human Genetics Center, University of Texas - Houston, P.O. Box 20334, Houston, Texas 77225.
# Contributed equally

Abstract

Using functional data analysis techniques, we developed generalized functional linear models for testing association between a dichotomous trait and multiple genetic variants in a genetic region while adjusting for covariates. Both fixed and mixed effect models are developed and compared. Extensive simulations show that Rao's efficient score tests of the fixed effect models are very conservative, generating type I error rates below the nominal levels, whereas global tests of the mixed effect models generate accurate type I error rates. Furthermore, Rao's efficient score test statistics of the fixed effect models have higher power than the sequence kernel association test (SKAT) and its optimal unified version (SKAT-O) in most cases when the causal variants are both rare and common. When the causal variants are all rare (i.e., minor allele frequencies less than 0.03), Rao's efficient score test statistics and the global tests have similar or slightly lower power than SKAT and SKAT-O. In practice, it is not known whether rare variants or common variants in a gene region are disease related; all we can assume is that a combination of rare and common variants influences disease susceptibility. Thus, the improved performance of our models when the causal variants are both rare and common shows that the proposed models can be very useful in dissecting complex traits. We compare the performance of our methods with SKAT and SKAT-O on real neural tube defects and Hirschsprung's disease datasets. Rao's efficient score test statistics and the global tests are more sensitive than SKAT and SKAT-O in the real data analysis. Our methods can be used in either gene-disease genome-wide/exome-wide association studies or candidate gene analyses.
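
As a rough illustration of the fixed effect approach summarized above, the sketch below runs a Rao score test in a logistic model whose genetic effect is expanded in a small set of basis functions over variant position. It is not the authors' implementation. The Fourier basis, the assumption that positions are already scaled to [0, 1], and the names `fourier_basis`, `fit_null_logistic`, and `score_test` are all illustrative choices that do not come from the abstract.

```python
import numpy as np


def fourier_basis(t, n_basis):
    """Evaluate a Fourier basis at positions t in [0, 1]; shape (len(t), n_basis)."""
    t = np.asarray(t, dtype=float)
    cols = [np.ones_like(t)]
    r = 1
    while len(cols) < n_basis:
        cols.append(np.sin(2 * np.pi * r * t))
        if len(cols) < n_basis:
            cols.append(np.cos(2 * np.pi * r * t))
        r += 1
    return np.column_stack(cols)


def fit_null_logistic(y, Z, n_iter=50, tol=1e-10):
    """Fit logit P(y = 1) = Z @ alpha by Newton-Raphson; return fitted probabilities."""
    alpha = np.zeros(Z.shape[1])
    for _ in range(n_iter):
        mu = 1.0 / (1.0 + np.exp(-(Z @ alpha)))
        d = mu * (1.0 - mu)
        step = np.linalg.solve(Z.T @ (Z * d[:, None]), Z.T @ (y - mu))
        alpha += step
        if np.max(np.abs(step)) < tol:
            break
    return 1.0 / (1.0 + np.exp(-(Z @ alpha)))


def score_test(y, Z, G, pos, n_basis=5):
    """Score test of no genetic effect, adjusting for covariates Z.

    y: 0/1 trait (n,); Z: covariates including an intercept column (n, p);
    G: genotype counts (n, m); pos: variant positions scaled to [0, 1] (m,).
    Returns the statistic and its degrees of freedom (hypothetical interface).
    """
    y = np.asarray(y, dtype=float)
    # Assumed discretization: the effect function beta(t) is a combination of
    # basis functions, so the genetic predictors reduce to W = G @ Phi.
    W = G @ fourier_basis(pos, n_basis)
    mu = fit_null_logistic(y, Z)          # covariate-only (null) model
    d = mu * (1.0 - mu)
    U = W.T @ (y - mu)                    # score vector for the basis coefficients
    DW = W * d[:, None]
    ZtDW = Z.T @ DW
    # Information for the basis coefficients after adjusting for covariates
    V = W.T @ DW - ZtDW.T @ np.linalg.solve(Z.T @ (Z * d[:, None]), ZtDW)
    stat = float(U @ np.linalg.solve(V, U))
    return stat, n_basis
```

Under the null hypothesis the statistic is referred to a chi-square distribution with `n_basis` degrees of freedom, so a p-value could be obtained with, for example, `scipy.stats.chi2.sf(stat, df)`.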

KEYWORDS:

case-control association studies; common variants; complex diseases; functional data analysis; generalized functional linear models; logistic regression; rare variants

PMID: 25203683
PMCID: PMC4189986
DOI: 10.1002/gepi.21840
[Indexed for MEDLINE]