Gene set analysis of SNP data: benefits, challenges, and future directions

Eur J Hum Genet. 2011 Aug;19(8):837-43. doi: 10.1038/ejhg.2011.57. Epub 2011 Apr 13.

Abstract

The last decade of human genetic research witnessed the completion of hundreds of genome-wide association studies (GWASs). However, the genetic variants discovered through these efforts account for only a small proportion of the heritability of complex traits. One explanation for the missing heritability is that the common analysis approach, assessing the effect of each single-nucleotide polymorphism (SNP) individually, is not well suited to the detection of small effects of multiple SNPs. Gene set analysis (GSA) is one of several approaches that may contribute to the discovery of additional genetic risk factors for complex traits. Complex phenotypes are thought to be controlled by networks of interacting biochemical and physiological pathways influenced by the products of sets of genes. By assessing the overall evidence of association of a phenotype with all measured variation in a set of genes, GSA may identify functionally relevant sets of genes that correspond to biomolecular pathways, which will enable more focused studies of genetic risk factors. This approach may thus contribute to the discovery of genetic variants responsible for some of the missing heritability. With the increased use of these approaches for the secondary analysis of data from GWASs, it is important to understand the different GSA methods and their strengths and weaknesses, and to consider the challenges inherent in these types of analyses. This paper provides an overview of GSA, highlighting the key challenges, potential solutions, and directions for ongoing research.
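To make the idea of "overall evidence of association" across a gene set concrete, the following is a minimal sketch of one possible set-level test, not the method of the paper. It assumes that single-SNP p-values are already available, combines those in a set with Fisher's statistic, and compares the result against random SNP sets of the same size. The function names and the input layout (`snp_p`, `set_snps`) are hypothetical, and the random-set comparison ignores linkage disequilibrium between SNPs, so a real analysis would more likely permute phenotype labels instead.

```python
import math
import random


def fisher_statistic(p_values):
    """Fisher's combined statistic: -2 * sum of log p-values."""
    return -2.0 * sum(math.log(p) for p in p_values)


def gene_set_p_value(snp_p, set_snps, n_draws=2000, seed=0):
    """Empirical p-value for a gene set against random SNP sets of equal size.

    snp_p    : dict mapping SNP id -> single-SNP association p-value
               (hypothetical input; p-values must be in (0, 1])
    set_snps : list of SNP ids mapped to the genes in the set
    Assumption: SNPs are treated as independent, i.e. LD is ignored.
    """
    rng = random.Random(seed)
    observed = fisher_statistic([snp_p[s] for s in set_snps])
    all_snps = list(snp_p)
    hits = 0
    for _ in range(n_draws):
        # Draw a random SNP set of the same size as the gene set.
        draw = rng.sample(all_snps, len(set_snps))
        if fisher_statistic([snp_p[s] for s in draw]) >= observed:
            hits += 1
    # Add-one correction keeps the empirical p-value strictly above zero.
    return (hits + 1) / (n_draws + 1)
```

A set of SNPs with individually modest p-values can yield a small set-level p-value under this scheme even when no single SNP would pass a genome-wide threshold, which is the situation the abstract describes.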

Publication types

  • Research Support, N.I.H., Extramural
  • Review

MeSH terms

  • Disease / genetics*
  • Genetic Predisposition to Disease*
  • Genome-Wide Association Study / methods*
  • Humans
  • Linkage Disequilibrium
  • Metabolic Networks and Pathways
  • Models, Statistical*
  • Polymorphism, Single Nucleotide*