Network-Informed Gene Ranking Tackles Genetic Heterogeneity in Exome-Sequencing Studies of Monogenic Disease

Hum Mutat. 2015 Dec;36(12):1135-44. doi: 10.1002/humu.22906. Epub 2015 Oct 7.

Abstract

Genetic heterogeneity presents a significant challenge for the identification of monogenic disease genes. Whole-exome sequencing generates a large number of candidate disease-causing variants, and typical analyses rely on deleterious variants being observed in the same gene across several unrelated affected individuals. This is less likely to occur for genetically heterogeneous diseases, making more advanced analysis methods necessary. To address this need, we present HetRank, a flexible gene-ranking method that incorporates interaction network data. We first show that different genes underlying the same monogenic disease are frequently connected in protein interaction networks. This motivates the central premise of HetRank: genes that carry potentially pathogenic variants, and whose network neighbors carry such variants in other affected individuals, are strong candidates for follow-up study. By simulating 1,000 exome sequencing studies (20,000 exomes in total), we model varying degrees of genetic heterogeneity and show that HetRank consistently prioritizes more disease-causing genes than existing analysis methods. We also demonstrate a proof-of-principle application of the method to prioritize genes causing Adams-Oliver syndrome, a genetically heterogeneous rare disease. An implementation of HetRank in R is available at http://sourceforge.net/p/hetrank/.
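The central premise can be illustrated with a toy scoring scheme. The sketch below is not HetRank's scoring function, which the abstract does not specify, and it is written in Python rather than R. It assumes a simple count: each gene scores one point per affected individual who carries a candidate variant in it, plus one point per other individual who carries a candidate variant in one of its network neighbors. The function name `rank_genes`, the gene labels `G1` to `G6`, the individuals `P1` to `P4`, and the network edges are all hypothetical.

```python
def rank_genes(variants_by_individual, network):
    """Toy network-informed ranking (illustrative only, not HetRank itself).

    variants_by_individual: dict mapping individual -> set of genes in which
        that individual carries a candidate variant.
    network: dict mapping gene -> set of neighboring genes.
    Returns a list of (gene, score) pairs, highest score first.
    """
    all_genes = set().union(*variants_by_individual.values())
    scores = {}
    for gene in all_genes:
        # Individuals with a candidate variant in this gene.
        carriers = {
            ind for ind, genes in variants_by_individual.items() if gene in genes
        }
        neighbors = network.get(gene, set())
        # Other individuals with a candidate variant in a neighboring gene.
        supported = {
            ind
            for ind, genes in variants_by_individual.items()
            if ind not in carriers and genes & neighbors
        }
        scores[gene] = len(carriers) + len(supported)
    # Sort by descending score, then by gene name for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical study of four affected individuals.
exomes = {
    "P1": {"G1", "G4"},
    "P2": {"G2", "G5"},
    "P3": {"G3", "G4"},
    "P4": {"G1", "G6"},
}
# Hypothetical interaction network as symmetric adjacency sets.
network = {
    "G1": {"G2", "G3"},
    "G2": {"G1"},
    "G3": {"G1"},
    "G5": {"G6"},
    "G6": {"G5"},
}

ranking = rank_genes(exomes, network)
for gene, score in ranking:
    print(gene, score)
```

In this toy data, G1 and G4 are each hit in two individuals, so a plain per-gene count cannot separate them. Under the assumed scoring, G1 ranks first because the two remaining individuals carry variants in its neighbors G2 and G3, whereas G4 has no network support.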

Keywords: Mendelian; NGS; genetic heterogeneity; interaction networks; monogenic; next generation sequencing; rare disease; variant prioritization; whole-exome sequencing.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Computational Biology / methods*
  • Computer Simulation
  • Epistasis, Genetic
  • Exome*
  • Gene Regulatory Networks
  • Genetic Association Studies / methods*
  • Genetic Diseases, Inborn / genetics
  • Genetic Diseases, Inborn / metabolism
  • Genetic Heterogeneity*
  • High-Throughput Nucleotide Sequencing*
  • Humans
  • Protein Interaction Mapping / methods
  • Software*
  • Web Browser