BMC Med Genomics. 2018 Sep 14;11(Suppl 3):71. doi: 10.1186/s12920-018-0388-0.

Min-redundancy and max-relevance multi-view feature selection for predicting ovarian cancer survival using multi-omics data.

El-Manzalawy Y1,2,3, Hsieh TY1,4,2, Shivakumar M5, Kim D6,7, Honavar V8,9,10,11,12.

Author information

1. Artificial Intelligence Research Laboratory, College of Information Sciences and Technology, Pennsylvania State University, University Park, PA, 16802, USA.
2. The Center for Big Data Analytics and Discovery Informatics, Pennsylvania State University, University Park, PA, 16802, USA.
3. The Clinical and Translational Sciences Institute, Pennsylvania State University, University Park, PA, 16802, USA.
4. School of Electrical Engineering and Computer Science, Pennsylvania State University, University Park, PA, 16802, USA.
5. Biomedical and Translational Informatics Institute, Geisinger Health System, Danville, PA, USA.
6. Biomedical and Translational Informatics Institute, Geisinger Health System, Danville, PA, USA. dkim@geisinger.edu.
7. The Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA, 16802, USA. dkim@geisinger.edu.
8. Artificial Intelligence Research Laboratory, College of Information Sciences and Technology, Pennsylvania State University, University Park, PA, 16802, USA. vhonavar@ist.psu.edu.
9. The Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA, 16802, USA. vhonavar@ist.psu.edu.
10. School of Electrical Engineering and Computer Science, Pennsylvania State University, University Park, PA, 16802, USA. vhonavar@ist.psu.edu.
11. The Center for Big Data Analytics and Discovery Informatics, Pennsylvania State University, University Park, PA, 16802, USA. vhonavar@ist.psu.edu.
12. The Clinical and Translational Sciences Institute, Pennsylvania State University, University Park, PA, 16802, USA. vhonavar@ist.psu.edu.

Abstract

BACKGROUND:

Large-scale collaborative precision medicine initiatives (e.g., The Cancer Genome Atlas (TCGA)) are yielding rich multi-omics data. Integrative analyses of the resulting multi-omics data, such as somatic mutation, copy number alteration (CNA), DNA methylation, miRNA, gene expression, and protein expression, offer tantalizing possibilities for realizing the promise and potential of precision medicine in cancer prevention, diagnosis, and treatment by substantially improving our understanding of underlying mechanisms as well as the discovery of novel biomarkers for different types of cancers. However, such analyses present a number of challenges, including the heterogeneity and high dimensionality of omics data.

METHODS:

We propose a novel framework for multi-omics data integration using multi-view feature selection. We introduce a novel multi-view feature selection algorithm, MRMR-mv, an adaptation of the well-known Min-Redundancy and Maximum-Relevance (MRMR) single-view feature selection algorithm to the multi-view setting.
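
For orientation, the single-view MRMR idea that MRMR-mv adapts can be sketched as a greedy loop: pick the feature with the highest mutual information with the label, then repeatedly add the feature whose relevance minus its mean redundancy with the already-selected features is largest. The abstract does not describe how MRMR-mv extends this across views, so the code below covers only the standard single-view case and is not the authors' algorithm. The function names `mutual_information` and `mrmr_select` are hypothetical, and the sketch assumes features and labels are already discretized.

```python
import numpy as np


def mutual_information(x, y):
    """Mutual information (in nats) between two discrete 1-D arrays."""
    x_vals, x_idx = np.unique(np.ravel(x), return_inverse=True)
    y_vals, y_idx = np.unique(np.ravel(y), return_inverse=True)
    # Build the joint distribution from co-occurrence counts.
    joint = np.zeros((len(x_vals), len(y_vals)))
    np.add.at(joint, (x_idx.ravel(), y_idx.ravel()), 1)
    joint /= joint.sum()
    px = joint.sum(axis=1, keepdims=True)  # marginal of x, shape (nx, 1)
    py = joint.sum(axis=0, keepdims=True)  # marginal of y, shape (1, ny)
    nz = joint > 0  # skip empty cells to avoid log(0)
    return float(np.sum(joint[nz] * np.log(joint[nz] / (px @ py)[nz])))


def mrmr_select(X, y, k):
    """Greedy single-view MRMR (difference criterion); returns column indices.

    Assumption: X is an (n_samples, n_features) array of discrete values.
    """
    n_features = X.shape[1]
    relevance = np.array(
        [mutual_information(X[:, j], y) for j in range(n_features)]
    )
    selected = [int(np.argmax(relevance))]
    redundancy_sum = np.zeros(n_features)
    while len(selected) < min(k, n_features):
        last = selected[-1]
        # Accumulate each feature's redundancy with the newest selection.
        redundancy_sum += np.array(
            [mutual_information(X[:, j], X[:, last]) for j in range(n_features)]
        )
        score = relevance - redundancy_sum / len(selected)
        score[selected] = -np.inf  # never re-select a feature
        selected.append(int(np.argmax(score)))
    return selected


if __name__ == "__main__":
    # Toy data: columns 0 and 1 are duplicates; column 2 is independent of them.
    a = np.array([0, 0, 1, 1] * 2)
    b = np.array([0, 1, 0, 1] * 2)
    y = a + 2 * b
    X = np.column_stack([a, a, b])
    print(mrmr_select(X, y, k=2))
```

In the toy data, the duplicated column is fully redundant once its twin has been chosen, so the redundancy penalty steers the second pick toward the independent column. Averaging the redundancy over the selected set is one common MRMR variant; a quotient form (relevance divided by redundancy) is also widely used.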

RESULTS:

We report results of experiments using an ovarian cancer multi-omics dataset derived from the TCGA database on the task of predicting ovarian cancer survival. Our results suggest that multi-view models outperform both view-specific models (i.e., models trained and tested using a single type of omics data) and models based on two baseline data fusion methods.

CONCLUSIONS:

Our results demonstrate the potential of multi-view feature selection in integrative analyses and predictive modeling from multi-omics data.

KEYWORDS:

Cancer survival prediction; Machine learning; Multi-omics data integration; Multi-view feature selection

PMID: 30255801
PMCID: PMC6157248
DOI: 10.1186/s12920-018-0388-0
[Indexed for MEDLINE]
Free PMC Article
