Identifying overlapping mutated driver pathways by constructing gene networks in cancer

BMC Bioinformatics. 2015;16 Suppl 5(Suppl 5):S3. doi: 10.1186/1471-2105-16-S5-S3. Epub 2015 Mar 18.

Abstract

Background: Large-scale cancer genomic projects are providing large amounts of data on genomic, epigenomic and gene expression aberrations in many cancer types. One key challenge in cancer genomics is to detect functional driver pathways and to filter out nonfunctional passenger genes. Vandin et al. introduced the Maximum Weight Sub-matrix Problem to find driver pathways and showed that it is NP-hard.
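For readers unfamiliar with the Maximum Weight Sub-matrix Problem, the following is a minimal sketch of the weight function as commonly stated for it: the weight of a gene set M is W(M) = 2|Γ(M)| − Σ_{g∈M} |Γ(g)|, where Γ(g) is the set of patients in which gene g is mutated and Γ(M) is the union over the genes of M. The abstract itself does not spell out the formula, so this form is an assumption here; the function names and the toy gene and patient labels are hypothetical.

```python
from typing import Dict, Iterable, Set


def coverage(mutations: Dict[str, Set[str]], gene_set: Iterable[str]) -> Set[str]:
    """Patients with a mutation in at least one gene of gene_set."""
    covered: Set[str] = set()
    for gene in gene_set:
        covered |= mutations.get(gene, set())
    return covered


def weight(mutations: Dict[str, Set[str]], gene_set: Iterable[str]) -> int:
    """W(M) = 2 * |coverage(M)| - sum of per-gene mutation counts.

    High coverage raises the first term; patients mutated in several
    genes of M (low exclusivity) raise the second term and lower W.
    """
    genes = set(gene_set)
    covered = coverage(mutations, genes)
    total = sum(len(mutations.get(gene, set())) for gene in genes)
    return 2 * len(covered) - total


# Hypothetical toy data: gene -> set of patients carrying a mutation.
toy = {
    "geneA": {"p1", "p2", "p3"},
    "geneB": {"p4", "p5"},
    "geneC": {"p3", "p4"},
}

exclusive_pair = weight(toy, ["geneA", "geneB"])        # no shared patients
with_overlap = weight(toy, ["geneA", "geneB", "geneC"])  # geneC overlaps both
```

In this toy example the mutually exclusive pair scores higher than the three-gene set, because the third gene adds no new patients and only contributes overlap.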

Methods: To find a better solution and solve the problem more efficiently, we present a network-based method (NBM) that detects overlapping driver pathways automatically. The algorithm finds driver pathways or gene sets de novo from somatic mutation data, without any prior information, by exploiting two combinatorial properties: high coverage and high exclusivity. First, we construct gene networks based on the approximate exclusivity between each pair of genes, using somatic mutation data from many cancer patients. Second, we present a new greedy strategy that adds or removes genes to obtain overlapping gene sets with driver mutations according to the properties of high exclusivity and high coverage.
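The two steps described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not give the exclusivity measure, the edge threshold, or the exact add/remove rule. The sketch assumes that pairwise exclusivity is the fraction of covered patients mutated in exactly one of the two genes, that an edge is drawn above a hypothetical threshold, and that the greedy step accepts a change only if it increases the coverage-minus-overlap weight 2|Γ(M)| − Σ|Γ(g)|. All names and the toy data are hypothetical.

```python
from itertools import combinations


def pair_exclusivity(a, b):
    """Assumed measure: share of covered patients mutated in exactly one gene."""
    union = a | b
    return len(a ^ b) / len(union) if union else 0.0


def build_network(mutations, threshold=0.9):
    """Step 1: connect gene pairs whose exclusivity reaches the threshold."""
    graph = {gene: set() for gene in mutations}
    for g, h in combinations(sorted(mutations), 2):
        if pair_exclusivity(mutations[g], mutations[h]) >= threshold:
            graph[g].add(h)
            graph[h].add(g)
    return graph


def weight(mutations, genes):
    """Coverage minus overlap: 2 * |covered patients| - sum of mutation counts."""
    covered = set()
    for gene in genes:
        covered |= mutations[gene]
    return 2 * len(covered) - sum(len(mutations[gene]) for gene in genes)


def grow_gene_set(mutations, graph, seed, max_size=5):
    """Step 2: greedily add or remove genes while the weight strictly improves."""
    current = set(seed)
    improved = True
    while improved:
        improved = False
        base = weight(mutations, current)
        # Try adding the neighbouring gene that gives the largest weight.
        neighbours = set()
        for gene in current:
            neighbours |= graph[gene]
        neighbours -= current
        if neighbours and len(current) < max_size:
            best = max(sorted(neighbours),
                       key=lambda g: weight(mutations, current | {g}))
            if weight(mutations, current | {best}) > base:
                current.add(best)
                improved = True
                continue
        # Try removing the gene whose removal gives the largest weight.
        if len(current) > 2:
            worst = max(sorted(current),
                        key=lambda g: weight(mutations, current - {g}))
            if weight(mutations, current - {worst}) > base:
                current.remove(worst)
                improved = True
    return frozenset(current)


def find_gene_sets(mutations, threshold=0.9, max_size=5):
    """Seed a search from every edge; distinct results may share genes."""
    graph = build_network(mutations, threshold)
    seeds = [(g, h) for g in sorted(graph) for h in sorted(graph[g]) if g < h]
    return {grow_gene_set(mutations, graph, seed, max_size) for seed in seeds}


# Hypothetical toy data: gene -> set of patient indices with a mutation.
toy_mutations = {
    "A": {1, 2, 3},
    "B": {4, 5},
    "C": {6},
    "D": {1, 2, 3, 4},
    "E": {5, 6},
}
gene_sets = find_gene_sets(toy_mutations)
```

Seeding from every edge is one simple way to obtain overlapping gene sets: searches that start in different parts of the network can converge on sets that share a gene, which a single global optimum would not report.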

Results: To assess the efficiency of the proposed NBM, we apply it to simulated data and compare its results with those of RME, Dendrix and Multi-Dendrix. NBM obtains optimal results in less than nine seconds on a conventional computer, and its time complexity is much lower than that of the other three methods. To further verify its performance, we apply NBM to somatic mutation data from five real biological data sets, including the mutation profiles of 90 glioblastoma tumor samples and 163 lung carcinoma samples. NBM detects groups of genes that overlap with known pathways, including the P53, RB and RTK/RAS/PI(3)K signaling pathways. New gene sets with p-values below 1e-3 are also found in the somatic mutation data.

Conclusions: NBM can detect more biologically relevant gene sets, and the results show that it outperforms the other algorithms at detecting driver pathways or gene sets. Future work will explore the use of novel machine learning techniques.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Female
  • Gene Regulatory Networks*
  • Genomics / methods
  • Glioblastoma / genetics
  • Head and Neck Neoplasms / genetics
  • Humans
  • Lung Neoplasms / genetics
  • Mutation / genetics*
  • Neoplasm Proteins / genetics*
  • Neoplasms / genetics*
  • Ovarian Neoplasms / genetics
  • Signal Transduction / genetics*

Substances

  • Neoplasm Proteins