Nucleic Acids Res. 2017 Sep 19;45(16):e151. doi: 10.1093/nar/gkx642.

De novo pathway-based biomarker identification.

Author information

Department of Mathematics and Computer Science, University of Southern Denmark, 5230 Odense, Denmark.
Department of Cancer and Inflammation Research, Institute of Molecular Medicine, University of Southern Denmark, 5000 Odense, Denmark.
The Bioinformatics Centre, Department of Biology, University of Copenhagen, 2200 Copenhagen, Denmark.
Computational Biology and Applied Algorithms, Max Planck Institute for Informatics, Saarland Informatics Campus, 66123 Saarbrücken, Germany.
Institute of Computational Biology, Helmholtz Zentrum München, 85764 Munich, Germany.
Department of Dermatology and Allergy, Technical University of Munich, 80802 Munich, Germany.
Department of Information and Engineering, University of Padova, 35122 Padova, Italy.
Department of Oncology, Odense University Hospital, 5000 Odense, Denmark.
Computational Systems Biology Group, Max Planck Institute for Informatics, Saarland Informatics Campus, 66123 Saarbrücken, Germany.


Gene expression profiles have been extensively discussed as an aid to guide therapy by predicting disease outcome for patients suffering from complex diseases, such as cancer. However, prediction models built upon single-gene (SG) features show poor stability and performance on independent datasets. Attempts to mitigate these drawbacks have led to the development of network-based approaches that integrate pathway information to produce meta-gene (MG) features. So far, however, MG approaches have only dealt with the two-class problem of good versus poor outcome prediction. Stratifying patients based on their molecular subtypes can provide a more detailed view of the disease and lead to more personalized therapies. We propose and discuss a novel MG approach based on de novo pathways, which for the first time have been used as features in a multi-class setting to predict cancer subtypes. Comprehensive evaluation in a large cohort of breast cancer samples from The Cancer Genome Atlas (TCGA) revealed that MG models are considerably more stable than SG models, while also providing valuable insight into the cancer hallmarks that drive them. In addition, when tested on an independent benchmark non-TCGA dataset, MG features consistently outperformed SG models. We provide an easy-to-use web service where users can upload their own gene expression datasets from breast cancer studies and obtain the subtype predictions from all the classifiers.
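
To make the idea of a meta-gene feature concrete, the following is a minimal sketch in Python of collapsing per-gene expression values into one value per pathway. It assumes the mean over a pathway's member genes as the aggregation rule, which is an illustrative choice and not necessarily the one used in the paper. The function name `meta_gene_features`, the gene and pathway names, and all values are hypothetical.

```python
from statistics import mean


def meta_gene_features(expression, pathways):
    """Collapse per-gene expression into one value per pathway.

    expression: {sample: {gene: value}}
    pathways:   {pathway_name: [gene, ...]}
    Aggregation by mean is an assumption made for illustration only.
    """
    features = {}
    for sample, genes in expression.items():
        row = {}
        for name, members in pathways.items():
            # Use only the pathway genes actually measured in this sample.
            values = [genes[g] for g in members if g in genes]
            # A pathway with no measured genes yields no feature.
            if values:
                row[name] = mean(values)
        features[sample] = row
    return features


# Toy data: gene names, values, and pathway memberships are invented.
expression = {
    "sample_1": {"GENE_A": 2.0, "GENE_B": 4.0, "GENE_C": 1.0},
    "sample_2": {"GENE_A": 6.0, "GENE_B": 8.0, "GENE_C": 3.0},
}
pathways = {
    "pathway_1": ["GENE_A", "GENE_B"],
    "pathway_2": ["GENE_C", "GENE_D"],  # GENE_D is unmeasured and skipped
}

features = meta_gene_features(expression, pathways)
print(features["sample_1"])
```

The resulting per-sample pathway values could then be passed to any multi-class classifier in place of the single-gene values, which is the sense in which MG features replace SG features.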

[Indexed for MEDLINE]
Free PMC Article
