Machine Learning Diagnosis of Small-Bowel Crohn Disease Using T2-Weighted MRI Radiomic and Clinical Data

AJR Am J Roentgenol. 2024 Jan;222(1):e2329812. doi: 10.2214/AJR.23.29812. Epub 2023 Aug 2.

Abstract

BACKGROUND. Radiologists have variable diagnostic performance and considerable interreader variability when interpreting MR enterography (MRE) examinations for suspected Crohn disease (CD).

OBJECTIVE. The purposes of this study were to develop a machine learning method for predicting ileal CD by use of radiomic features of ileal wall and mesenteric fat from noncontrast T2-weighted MRI and to compare the performance of the method with that of expert radiologists.

METHODS. This single-institution study included retrospectively identified patients who underwent MRE for suspected ileal CD from January 1, 2020, to January 31, 2021, and prospectively enrolled participants (patients with newly diagnosed ileal CD or healthy control participants) from December 2018 to October 2021. Using axial T2-weighted SSFSE images, a radiologist selected two slices showing greatest terminal ileal wall thickening. Four ROIs were segmented, and radiomic features were extracted from each ROI. After feature selection, support-vector machine models were trained to classify the presence of ileal CD. Three fellowship-trained pediatric abdominal radiologists independently classified the presence of ileal CD on SSFSE images. The reference standard was clinical diagnosis of ileal CD based on endoscopy and biopsy results. Radiomic-only, clinical-only, and radiomic-clinical ensemble models were trained and evaluated by nested cross-validation.

RESULTS. The study included 135 participants (67 female, 68 male; mean age, 15.2 ± 3.2 years); 70 were diagnosed with ileal CD. The three radiologists had accuracies of 83.7% (113/135), 88.1% (119/135), and 86.7% (117/135) for diagnosing CD; consensus accuracy was 88.1%. Interradiologist agreement was substantial (κ = 0.78). The best-performing ROI was bowel core (AUC, 0.95; accuracy, 89.6%); other ROIs had worse performance (whole-bowel AUC, 0.86; fat-core AUC, 0.70; whole-fat AUC, 0.73). For the clinical-only model, AUC was 0.85 and accuracy was 80.0%. The ensemble model combining bowel-core radiomic and clinical models had AUC of 0.98 and accuracy of 93.5%. The bowel-core radiomic-only model had significantly greater accuracy than radiologist 1 (p = .009) and radiologist 2 (p = .02) but not radiologist 3 (p > .99) or the radiologists in consensus (p = .05). The ensemble model had greater accuracy than the radiologists in consensus (p = .02).

CONCLUSION. A radiomic machine learning model predicted CD diagnosis with better performance than two of three expert radiologists. Model performance improved when radiomic data were ensembled with clinical data.

CLINICAL IMPACT. Deployment of a radiomic-based model including T2-weighted MRI data could decrease interradiologist variability and increase diagnostic accuracy for pediatric CD.
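
The evaluation scheme named in the methods (feature selection, then a support-vector machine, assessed by nested cross-validation) can be illustrated with a minimal sketch. This is not the authors' pipeline. The abstract does not state the feature-selection method, SVM kernel, hyperparameter grid, or fold counts, so the univariate F-test selector, RBF kernel, grid values, 3-fold inner loop, and 5-fold outer loop below are all assumptions, and the data are synthetic. The sketch uses scikit-learn.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in data (assumption): 135 samples to mirror the cohort
# size; the 50 "radiomic" features and their values are made up.
X, y = make_classification(
    n_samples=135, n_features=50, n_informative=8, random_state=0
)

# Scaling, feature selection, and the classifier live in one pipeline so
# that each step is refit on training folds only (no leakage into test folds).
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("select", SelectKBest(score_func=f_classif)),  # assumed selector
    ("svm", SVC(kernel="rbf")),                     # assumed kernel
])

# Hypothetical hyperparameter grid, tuned in the inner loop.
param_grid = {"select__k": [5, 10], "svm__C": [0.1, 1.0, 10.0]}

inner_cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=1)
outer_cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=2)

# Inner loop: choose hyperparameters. Outer loop: estimate performance of
# the whole tuning procedure on folds never seen during tuning.
search = GridSearchCV(pipeline, param_grid, cv=inner_cv, scoring="roc_auc")
outer_auc = cross_val_score(search, X, y, cv=outer_cv, scoring="roc_auc")

print(f"outer-fold AUC: mean={outer_auc.mean():.2f}, sd={outer_auc.std():.2f}")
```

Wrapping the tuned estimator in an outer cross-validation loop keeps the performance estimate separate from the data used to pick features and hyperparameters, which matters with a cohort of this size and many candidate features.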

Keywords: Crohn disease; inflammatory bowel disease; machine learning; radiomics.

MeSH terms

  • Adolescent
  • Child
  • Crohn Disease*
  • Female
  • Humans
  • Ileal Diseases*
  • Machine Learning
  • Magnetic Resonance Imaging / methods
  • Male
  • Radiomics
  • Retrospective Studies