Big data visualization identifies the multidimensional molecular landscape of human gliomas

Proc Natl Acad Sci U S A. 2016 May 10;113(19):5394-9. doi: 10.1073/pnas.1601591113. Epub 2016 Apr 26.

Abstract

We show that visualizing large molecular and clinical datasets enables discovery of molecularly defined categories of highly similar patients. We generated a series of linked 2D sample similarity plots using genome-wide single nucleotide alterations (SNAs), copy number alterations (CNAs), DNA methylation, and RNA expression data. Applying this approach to the combined glioblastoma (GBM) and lower-grade glioma (LGG) datasets from The Cancer Genome Atlas, we find that combined CNA/SNA data divide gliomas into three highly distinct molecular groups. The mutations commonly used in clinical evaluation of these tumors are regionally distributed in these plots. One of the three groups is a mixture of GBM and LGG whose methylation and survival characteristics are similar to those of GBM. Altogether, our approach identifies eight molecularly defined glioma groups with distinct sequence/expression/methylation profiles. Importantly, we show that regionally clustered samples are enriched for specific drug targets.
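
As a rough illustration of what a 2D sample similarity plot involves, the sketch below projects a sample-by-feature matrix onto two dimensions so that similar samples land near each other. The abstract does not name the projection method, so classical multidimensional scaling is an assumption here, and the function name `classical_mds` and the synthetic input are hypothetical stand-ins for real CNA/SNA, methylation, or expression matrices.

```python
import numpy as np


def classical_mds(features, n_components=2):
    """Project samples (rows) to n_components dimensions with classical MDS.

    Assumption: Euclidean distance between feature vectors is used as the
    dissimilarity; the source does not specify a metric.
    """
    x = np.asarray(features, dtype=float)

    # Squared Euclidean distances between all pairs of samples.
    sq_norms = (x ** 2).sum(axis=1)
    d2 = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (x @ x.T)
    d2 = np.maximum(d2, 0.0)  # guard against tiny negative rounding errors

    # Double-centre the squared distances to obtain a Gram matrix.
    n = d2.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ d2 @ centering

    # The leading eigenvectors, scaled by sqrt(eigenvalue), are the coordinates.
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:n_components]
    top_vals = np.clip(eigvals[order], 0.0, None)
    return eigvecs[:, order] * np.sqrt(top_vals)


# Synthetic stand-in data: two groups of five samples with 20 features each.
rng = np.random.default_rng(0)
group_a = rng.normal(0.0, 0.1, size=(5, 20))
group_b = rng.normal(3.0, 0.1, size=(5, 20))
coords = classical_mds(np.vstack([group_a, group_b]))
```

Each row of `coords` is one sample's position in the plot. Running the same projection separately on each data type and highlighting the same samples across the resulting plots would be one way to approximate the "linked" views the abstract describes.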

Keywords: big data; biomarkers; glioma; precision medicine; visualization.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Biomarkers, Tumor / genetics
  • Brain Neoplasms
  • Computer Graphics
  • Data Mining / methods*
  • Databases, Genetic*
  • Datasets as Topic*
  • Genetic Predisposition to Disease / genetics
  • Glioma
  • High-Throughput Nucleotide Sequencing / methods*
  • Humans
  • Neoplasm Proteins / genetics*
  • User-Computer Interface*

Substances

  • Biomarkers, Tumor
  • Neoplasm Proteins