Eur J Hum Genet. 2014 Sep;22(9):1137-44. doi: 10.1038/ejhg.2013.297. Epub 2014 Jan 8.

Analysis of rare variant population structure in Europeans explains differential stratification of gene-based tests.

Author information

1 Department of Biostatistics and Center for Statistical Genetics, University of Michigan, Ann Arbor, MI, USA.
2 Department of Biology, University of Fribourg, Fribourg, Switzerland.
3 Quantitative Sciences, GlaxoSmithKline, Research Triangle Park, NC, USA.
4 Department of Human Genetics, University of Chicago, Chicago, IL, USA.
5 [1] Department of Biostatistics and Center for Statistical Genetics, University of Michigan, Ann Arbor, MI, USA [2] Department of Psychiatry, University of Michigan, Ann Arbor, MI, USA.

Abstract

There is substantial interest in the role of rare genetic variants in the etiology of complex human diseases. Several gene-based tests have been developed to simultaneously analyze multiple rare variants for association with phenotypic traits. The tests can largely be partitioned into two classes - 'burden' tests and 'joint' tests - based on how they accumulate evidence of association across sites. We used the empirical joint site frequency spectra of rare, nonsynonymous variation from a large multi-population sequencing study to explore the effect of realistic rare variant population structure on gene-based tests. We observed an important difference between the two test classes: their susceptibility to population stratification. Focusing on European samples, we found that joint tests, which allow variants to have opposite directions of effect, consistently showed higher levels of P-value inflation than burden tests. We determined that the differential stratification was caused by two specific patterns in the interpopulation distribution of rare variants, each correlating with inflation in one of the test classes. The pattern that inflates joint tests is more prevalent in real data, explaining the higher levels of inflation in these tests. Furthermore, we show that the different sources of inflation between tests lead to heterogeneous responses to genomic control correction and the number of variants analyzed. Our results indicate that care must be taken when interpreting joint and burden analyses of the same set of rare variants, in particular, to avoid mistaking inflated P-values in joint tests for stronger signals of true associations.
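
The distinction between the two test classes can be illustrated with a minimal sketch. It assumes unweighted, unnormalized score statistics, which is a simplification for illustration and not the specific tests evaluated in the study. The function names, toy genotypes, and phenotypes are all hypothetical. A burden-style statistic sums the per-variant scores before squaring, so variants with opposite directions of effect cancel. A joint-style statistic squares each score before summing, so they do not.

```python
from statistics import mean

def score_vector(genotypes, phenotype):
    """Per-variant score U_j = sum_i g_ij * (y_i - mean(y)).

    genotypes: list of per-sample rows, one rare-allele count per variant.
    phenotype: list of per-sample trait values (here 1 = case, 0 = control).
    """
    y_bar = mean(phenotype)
    residuals = [y - y_bar for y in phenotype]
    n_variants = len(genotypes[0])
    return [
        sum(row[j] * r for row, r in zip(genotypes, residuals))
        for j in range(n_variants)
    ]

def burden_statistic(scores):
    """Burden-style: collapse scores first, then square (opposite signs cancel)."""
    return sum(scores) ** 2

def joint_statistic(scores):
    """Joint-style: square each score, then sum (opposite signs accumulate)."""
    return sum(u * u for u in scores)

# Toy data (hypothetical): variant 0 is carried only by cases,
# variant 1 only by controls, so their scores have opposite signs.
genotypes = [
    [1, 0], [1, 0], [0, 0], [0, 0],   # cases
    [0, 1], [0, 1], [0, 0], [0, 0],   # controls
]
phenotype = [1, 1, 1, 1, 0, 0, 0, 0]

scores = score_vector(genotypes, phenotype)
print("scores:", scores)
print("burden:", burden_statistic(scores))
print("joint: ", joint_statistic(scores))
```

In this toy case the two scores are equal in magnitude and opposite in sign, so the burden-style statistic is zero while the joint-style statistic is not. The same mechanism applies whether the opposing scores come from true effects or from rare variants that differ in frequency between populations.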

PMID: 24398795
PMCID: PMC4135410
DOI: 10.1038/ejhg.2013.297
[Indexed for MEDLINE]
Free PMC Article