Statistical inference of genetic pathway analysis in high dimensions

Biometrika. 2019 Sep;106(3):651. doi: 10.1093/biomet/asz033. Epub 2019 Jul 13.

Abstract

Genetic pathway analysis has become an important tool for investigating the association between a group of genetic variants and traits. With dense genotyping and extensive imputation, the number of genetic variants in biological pathways has increased considerably and sometimes exceeds the sample size $n$. Conducting genetic pathway analysis and statistical inference in such settings is challenging. We introduce an approach that can handle pathways whose dimension $p$ could be greater than $n$. Our method can be used to detect pathways that have nonsparse weak signals, as well as pathways that have sparse but stronger signals. We establish the asymptotic distribution for the proposed statistic and conduct theoretical analysis on its power. Simulation studies show that our test has correct Type I error control and is more powerful than existing approaches. An application to a genome-wide association study of high-density lipoproteins demonstrates the proposed approach.

Keywords: Genetic pathway analysis; Genetic variant; High-dimensional inference; Nonsparse signal; Power analysis; Sparse signal.