Genet Epidemiol. 2020 Feb 11. doi: 10.1002/gepi.22283. [Epub ahead of print]

Incorporating external information to improve sparse signal detection in rare-variant gene-set-based analyses.

Author information

1. Department of Biostatistics and Bioinformatics, Duke University, Durham, North Carolina.
2. Center for Genomic and Computational Biology, Duke University, Durham, North Carolina.
3. Center for Statistical Genetics and Genomics, Duke University, Durham, North Carolina.
4. Institute of Genomic Medicine, Columbia University, New York City, New York.
5. Department of Neurology, Columbia University, New York City, New York.
6. Center for Motor Neuron Biology and Disease, Columbia University, New York City, New York.

Abstract

Gene-set analyses are used to assess whether there is any evidence of association with disease among a set of biologically related genes. Such an analysis typically treats all genes within a set alike, even though there is substantial external information concerning the likely importance of each gene within each set. For example, for traits that are under purifying selection, we would expect genes showing extensive genic constraint to be more likely to be trait-associated than unconstrained genes. Here we improve gene-set analyses by incorporating such external information into a higher-criticism-based signal detection analysis. We show that when this external information is predictive of whether a gene is associated with disease, our approach can lead to a significant increase in power. Further, our approach is particularly powerful when the signal is sparse, that is, when only a small number of genes within the set are associated with the trait. We illustrate our approach with a gene-set analysis of amyotrophic lateral sclerosis (ALS), in which we implicate a number of gene-sets containing SOD1 and NEK1 and show enrichment of small p values for gene-sets containing known ALS genes. We implement our approach in the R package wHC.
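
As a rough illustration of the idea of combining weighted p values with higher criticism, the following Python sketch computes the standard higher criticism statistic on p values that have been divided by external weights. It is not the method of the paper and not the interface of the wHC package: the function names, the `alpha0` truncation parameter, the rescaling of weights to mean one, and the toy numbers are all assumptions made for this example.

```python
import math


def weighted_pvalues(pvalues, weights):
    """Divide each p value by its weight after rescaling weights to mean 1.

    Assumption: a larger weight marks a gene that external information
    (for example, genic constraint) suggests is more likely to be
    trait-associated. Results are capped at 1.
    """
    mean_w = sum(weights) / len(weights)
    return [min(1.0, p * mean_w / w) for p, w in zip(pvalues, weights)]


def higher_criticism(pvalues, alpha0=0.5):
    """Standard higher criticism statistic over the smallest alpha0 fraction.

    For sorted p values p_(1) <= ... <= p_(n), this returns the maximum of
    sqrt(n) * (i/n - p_(i)) / sqrt(p_(i) * (1 - p_(i))) over the first
    max(1, floor(alpha0 * n)) indices.
    """
    n = len(pvalues)
    sorted_p = sorted(pvalues)
    k_max = max(1, int(alpha0 * n))
    best = -math.inf
    for i, p in enumerate(sorted_p[:k_max], start=1):
        # Clamp away from 0 and 1 so the denominator stays finite.
        p = min(max(p, 1e-300), 1.0 - 1e-12)
        stat = math.sqrt(n) * (i / n - p) / math.sqrt(p * (1.0 - p))
        best = max(best, stat)
    return best


# Toy gene set (made-up numbers): one gene carries signal, and the
# hypothetical external weights happen to favour that gene.
gene_pvalues = [0.001, 0.2, 0.5, 0.7, 0.9, 0.4, 0.6, 0.8, 0.3, 0.95]
gene_weights = [4, 1, 1, 1, 1, 1, 1, 1, 1, 1]

hc_unweighted = higher_criticism(gene_pvalues)
hc_weighted = higher_criticism(weighted_pvalues(gene_pvalues, gene_weights))
print(hc_unweighted, hc_weighted)
```

In this toy set the single informative gene is up-weighted, so the weighted statistic comes out larger than the unweighted one. If the weights pointed at the wrong genes, the same rescaling would inflate the p value of the true signal instead, which is why the gain depends on the external information being predictive.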

KEYWORDS: amyotrophic lateral sclerosis; gene-set-based analysis; higher criticism; prior information; weighted p values

PMID: 32043633
DOI: 10.1002/gepi.22283
