Multi-omics integration for neuroblastoma clinical endpoint prediction

Biol Direct. 2018 Apr 3;13(1):5. doi: 10.1186/s13062-018-0207-8.

Abstract

Background: High-throughput methodologies such as microarrays and next-generation sequencing are routinely used in cancer research, generating complex data at different omics layers. The effective integration of omics data could provide a broader insight into the mechanisms of cancer biology, helping researchers and clinicians to develop personalized therapies.

Results: In the context of the CAMDA 2017 Neuroblastoma Data Integration challenge, we explore the use of Integrative Network Fusion (INF), a bioinformatics framework that combines similarity network fusion with machine learning to integrate multiple omics data. We apply the INF framework to the prediction of neuroblastoma patient outcome, integrating RNA-Seq, microarray and array comparative genomic hybridization data. We additionally explore the use of autoencoders as a method to integrate microarray expression and copy number data.
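
To make the idea of fusing patient similarity networks concrete, the following is a minimal sketch in Python with NumPy. It is not the authors' INF implementation: the function names (`affinity`, `knn_kernel`, `fuse_networks`), the Gaussian kernel scaling, the neighbourhood size `k` and the iteration count are all assumptions chosen for illustration, and the downstream machine learning step is omitted. Each omics layer is given as a samples-by-features matrix over the same patients, and the result is a single patient-by-patient similarity matrix.

```python
import numpy as np


def affinity(X, sigma=1.0):
    """Gaussian similarity between samples (rows of X). Illustrative kernel choice."""
    sq = np.sum(X ** 2, axis=1)
    # Squared Euclidean distances, clamped at 0 to absorb rounding error.
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    scale = sigma * d2.mean()  # assumed scaling: mean squared distance
    return np.exp(-d2 / scale)


def row_normalize(W):
    """Divide each row by its sum so that every row sums to 1."""
    return W / W.sum(axis=1, keepdims=True)


def knn_kernel(W, k):
    """Keep the k largest similarities in each row, zero the rest, then row-normalize."""
    S = np.zeros_like(W)
    idx = np.argsort(-W, axis=1)[:, :k]
    rows = np.arange(W.shape[0])[:, None]
    S[rows, idx] = W[rows, idx]
    return row_normalize(S)


def fuse_networks(views, k=3, iters=10):
    """Simplified similarity network fusion over two or more data views.

    Each view's global similarity is repeatedly diffused through that view's
    local (k-nearest-neighbour) kernel, using the average of the other views.
    """
    W = [affinity(X) for X in views]
    P = [row_normalize(w) for w in W]   # global similarity per view
    S = [knn_kernel(w, k) for w in W]   # local similarity per view
    for _ in range(iters):
        P_next = []
        for v in range(len(views)):
            others = [P[u] for u in range(len(views)) if u != v]
            avg = sum(others) / len(others)
            P_next.append(row_normalize(S[v] @ avg @ S[v].T))
        P = P_next
    fused = sum(P) / len(P)
    return (fused + fused.T) / 2.0      # symmetrize the final network


# Toy usage with random stand-ins for three omics layers on the same 8 patients.
rng = np.random.default_rng(0)
toy_views = [rng.normal(size=(8, 20)), rng.normal(size=(8, 15)), rng.normal(size=(8, 10))]
fused_similarity = fuse_networks(toy_views, k=3, iters=5)
print(fused_similarity.shape)
```

In a pipeline of this kind, the fused matrix could then be used, for example, to rank or select features before training a classifier; how INF itself couples the fused network to the learning step is described in the full article rather than in this abstract.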

Conclusions: The INF method is effective for integrating multiple data sources, providing compact feature signatures for patient classification with performance comparable to that of other methods. The latent space representation of the integrated data provided by the autoencoder approach gives promising results, both by improving classification on survival endpoints and by providing a means to discover two groups of patients characterized by distinct overall survival (OS) curves.

Reviewers: This article was reviewed by Djork-Arné Clevert and Tieliu Shi.

Keywords: Autoencoder; Classification; Integration; Neuroblastoma; Prediction.

MeSH terms

  • Animals
  • Computational Biology
  • Gene Expression Profiling
  • Genomics / methods*
  • High-Throughput Nucleotide Sequencing
  • Humans
  • Neuroblastoma / genetics*
  • Neuroblastoma / metabolism*
  • Neuroblastoma / pathology