Performance of deep learning to detect mastoiditis using multiple conventional radiographs of mastoid

PLoS One. 2020 Nov 11;15(11):e0241796. doi: 10.1371/journal.pone.0241796. eCollection 2020.

Abstract

Objectives: This study aimed to compare the diagnostic performance of a deep learning algorithm trained on a single view (anterior-posterior (AP) or lateral view) with that of one trained on multiple views (both views together) for diagnosing mastoiditis on mastoid series, and to compare the diagnostic performance of the algorithm with that of radiologists.

Methods: A total of 9,988 mastoid series (AP and lateral views) were classified as normal or abnormal (mastoiditis) based on radiographic findings. Of these, 792 image sets with temporal bone CT were designated as the gold standard test set, and the remaining sets were randomly divided into training (n = 8,276) and validation (n = 920) sets at a 9:1 ratio for developing a deep learning algorithm. Temporal (n = 294) and geographic (n = 308) external test sets were also collected. The diagnostic performance of the deep learning algorithm trained on a single view was compared with that of the algorithm trained on multiple views. The diagnostic performance of the algorithm and of two radiologists was assessed. Inter-observer agreement between the algorithm and the radiologists and between the two radiologists was calculated.
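The split described above can be checked with a short sketch. This is only an illustration of the arithmetic and of a seeded random 9:1 split, not the authors' code; the function name `split_train_val`, the seed, and the use of integer indices in place of image sets are assumptions made for the example.

```python
import random

# Counts taken from the abstract.
TOTAL_SERIES = 9988   # all mastoid series
GOLD_STANDARD = 792   # sets with temporal bone CT, held out as the test set


def split_train_val(n_items, train_fraction=0.9, seed=0):
    """Shuffle item indices and split them into training and validation lists.

    Hypothetical helper: the abstract states only that the split was random
    and 9:1, so the shuffling method and seed here are assumptions.
    """
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    n_train = round(n_items * train_fraction)
    return indices[:n_train], indices[n_train:]


remaining = TOTAL_SERIES - GOLD_STANDARD
train_idx, val_idx = split_train_val(remaining)
print(remaining, len(train_idx), len(val_idx))
```

Under these assumptions the sizes match the reported training and validation counts, and the two index lists do not overlap.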

Results: The area under the receiver operating characteristic curve of the algorithm using multiple views (0.971, 0.978, and 0.965 for the gold standard, temporal external, and geographic external test sets, respectively) was higher than that of the algorithm using a single view (0.964/0.953, 0.952/0.961, and 0.961/0.942 for the AP view/lateral view of the gold standard, temporal external, and geographic external test sets, respectively) in all test sets. The algorithm showed statistically significantly higher specificity than the radiologists (p = 0.018 and 0.012). There was substantial agreement between the algorithm and the two radiologists and between the two radiologists (κ = 0.79, 0.8, and 0.76).
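For readers unfamiliar with the two statistics reported above, the following sketch shows how an area under the ROC curve and Cohen's κ can be computed from first principles. It is not the authors' analysis: the function names are hypothetical, and the scores, labels, and ratings are made-up toy values that do not come from the study.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a pair.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    categories = set(rater_a) | set(rater_b)
    expected = sum(
        (rater_a.count(c) / n) * (rater_b.count(c) / n) for c in categories
    )
    return (observed - expected) / (1 - expected)


# Toy data for illustration only (1 = mastoiditis, 0 = normal).
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)

toy_reader_1 = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
toy_reader_2 = [1, 1, 0, 0, 0, 0, 1, 0, 1, 1]
toy_kappa = cohen_kappa(toy_reader_1, toy_reader_2)

print(round(toy_auc, 3), round(toy_kappa, 3))
```

On the commonly used Landis and Koch scale, κ values from 0.61 to 0.80 are labeled "substantial" agreement, which is consistent with the wording used for the reported values.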

Conclusion: The deep learning algorithm trained on multiple views performed better than that trained on a single view. The diagnostic performance of the algorithm for detecting mastoiditis on mastoid series was similar to or higher than that of radiologists.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Deep Learning
  • Humans
  • Mastoid / diagnostic imaging
  • Mastoid / pathology*
  • Mastoiditis / diagnosis*
  • Mastoiditis / diagnostic imaging
  • ROC Curve
  • Retrospective Studies

Grants and funding

IR, Grant No: 2017R1C1B5076240, National Research Foundation of Korea, URL: www.nrf.re.kr; KJL, Grant No: 13-2019-006, Seoul National University Bundang Hospital Research Fund, URL: www.snubh.org. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.