Using deep neural networks and interpretability methods to identify gene expression patterns that predict radiomic features and histology in non-small cell lung cancer

J Med Imaging (Bellingham). 2021 May;8(3):031906. doi: 10.1117/1.JMI.8.3.031906. Epub 2021 May 8.

Abstract

Purpose: Integrative analysis combining diagnostic imaging and genomic information can uncover biological insights into lesions that are visible on radiologic images. We investigate techniques for interrogating a deep neural network trained to predict quantitative image (radiomic) features and histology from gene expression in non-small cell lung cancer (NSCLC). Approach: Using 262 training and 89 testing cases from two public datasets, deep feedforward neural networks were trained to predict the values of 101 computed tomography (CT) radiomic features and histology. A model interrogation method called gene masking was used to derive the learned associations between subsets of genes and a radiomic feature or histology class [adenocarcinoma (ADC), squamous cell, and other]. Results: Overall, neural networks outperformed other classifiers. In testing, neural networks classified histology with area under the receiver operating characteristic curves (AUCs) of 0.86 (ADC), 0.91 (squamous cell), and 0.71 (other). Classification performance of radiomics features ranged from 0.42 to 0.89 AUC. Gene masking analysis revealed new and previously reported associations. For example, hypoxia genes predicted histology ( > 0.90 AUC ). Previously published gene signatures for classifying histology were also predictive in our model ( > 0.80 AUC ). Gene sets related to the immune or cardiac systems and cell development processes were predictive ( > 0.70 AUC ) of several different radiomic features. AKT signaling, tumor necrosis factor, and Rho gene sets were each predictive of tumor textures. Conclusions: This work demonstrates neural networks' ability to map gene expressions to radiomic features and histology types in NSCLC and to interpret the models to identify predictive genes associated with each feature or type.

Keywords: deep learning; gene expression; model interpretability; non-small cell lung cancer; radiogenomics.