Module Cover: A New Approach to Genotype-Phenotype Studies
Uncovering and interpreting genotype-phenotype relationships are among the most challenging open questions in disease studies. Set cover approaches are recognized as an important class of methods for identifying genes that can serve as disease markers. Because they are explicitly designed to provide a representation for all disease cases, such techniques are valuable in studies of heterogeneous datasets. At the same time, pathway-centric methods have emerged as key approaches that significantly empower studies of genotype-phenotype relationships. Combining the utility of set cover techniques with the power of network-centric approaches, we designed a novel approach that extends the concept of set cover to network module cover. We developed an integrated method that simultaneously determines network modules and optimizes the coverage of disease cases. For comparison, we also implemented a two-step method in which we first determined a candidate set of network modules and subsequently selected the modules that cover disease cases. We demonstrated that the integrated approach outperforms the two-step method. We applied our module cover approach to a heterogeneous dataset of brain cancer patients and identified numerous network modules whose activity is perturbed in a coherent way by specific genomic alterations in the disease. While previous module-finding approaches concentrated on identifying modules that are common to all disease cases, our approach allows the determination of genotype-phenotype relationships by capturing the heterogeneity of the underlying disease dataset.
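
As a point of reference for the selection step of the two-step variant, choosing modules that cover disease cases can be approximated with the standard greedy set cover heuristic. The sketch below is a minimal illustration and not the method used in this work. The function name `greedy_module_cover`, the module names, and the patient identifiers are all hypothetical, and the sketch assumes that each candidate module has already been mapped to the set of cases it covers.

```python
from typing import Dict, List, Set


def greedy_module_cover(
    module_cases: Dict[str, Set[str]],
    cases: Set[str],
) -> List[str]:
    """Greedily pick modules until every coverable case is covered.

    module_cases maps a (hypothetical) module name to the set of disease
    cases that the module is assumed to cover.
    """
    uncovered = set(cases)
    chosen: List[str] = []
    while uncovered and module_cases:
        # Pick the module that covers the most still-uncovered cases.
        # Iterating in sorted order makes ties resolve to the
        # alphabetically first module, so the result is deterministic.
        best = max(
            sorted(module_cases),
            key=lambda m: len(module_cases[m] & uncovered),
        )
        gained = module_cases[best] & uncovered
        if not gained:
            break  # the remaining cases are not covered by any module
        chosen.append(best)
        uncovered -= gained
    return chosen


# Illustrative toy data (assumed, not taken from the study).
modules = {
    "M1": {"p1", "p2", "p3"},
    "M2": {"p3", "p4"},
    "M3": {"p4", "p5"},
    "M4": {"p5"},
}
patients = {"p1", "p2", "p3", "p4", "p5"}
selected = greedy_module_cover(modules, patients)
print(selected)
```

The greedy heuristic treats the candidate modules as fixed, which is what distinguishes a two-step procedure from an integrated one that determines the modules and optimizes the coverage at the same time.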

