[Agreement evaluation of the severity of oral epithelial dysplasia in oral leukoplakia]

Zhonghua Kou Qiang Yi Xue Za Zhi. 2022 Sep 9;57(9):921-926. doi: 10.3760/cma.j.cn112144-20211206-00537.
[Article in Chinese]

Abstract

Objective: To evaluate the inter-observer agreement on the severity of oral epithelial dysplasia in oral leukoplakia, providing a theoretical basis for the development of a more objective grading system. Methods: This study included 60 digital pathological slides of oral leukoplakia from the Department of Oral Medicine, West China Hospital of Stomatology, Sichuan University, and 239 tissue microarray images of oral leukoplakia from the State Key Laboratory of Oral Diseases, Sichuan University, to evaluate the agreement of grading. In addition, 1 000 patches were generated from the 60 digital pathological slides and divided into 500 small-sized patches (224 pixels×224 pixels) and 500 large-sized patches (1 024 pixels×1 024 pixels) to evaluate the agreement of feature detection. Grading and feature detection were performed by three expert pathologists from the oral pathology departments of two Grade 3, Class A stomatological hospitals in China. The Kappa coefficient was used to quantify inter-observer agreement among the pathologists. Results: Minimal agreement was found in the grading of oral epithelial dysplasia among the pathologists (Kappa=0.30 in the pathological slide group and Kappa=0.30 in the tissue microarray group). No agreement was found in feature detection within the small-sized patch group (median Kappa=0.14 for architectural features, median Kappa=0.18 for cytological features), and minimal agreement was found in feature detection within the large-sized patch group (median Kappa=0.25 for architectural features, median Kappa=0.25 for cytological features). Conclusions: Overall, the agreement on grading and feature detection of oral epithelial dysplasia in oral leukoplakia is poor. The development of a more objective, artificial intelligence-based grading system for oral epithelial dysplasia may help improve agreement.
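The abstract reports one Kappa per group and median Kappas per feature set, but does not state whether a pairwise (Cohen's) or multi-rater (Fleiss') statistic was used. As a minimal illustrative sketch only, not the authors' analysis, the Python snippet below computes pairwise Cohen's kappa for three hypothetical raters with scikit-learn and applies the commonly used interpretation bands of McHugh (2012), which match the "no"/"minimal" agreement wording above; the rater labels are invented for illustration.

    # Minimal sketch (not the authors' code): pairwise Cohen's kappa for three raters.
    from itertools import combinations
    from statistics import median

    from sklearn.metrics import cohen_kappa_score

    # Hypothetical gradings of the same slides by three pathologists
    # (0 = no dysplasia, 1 = mild, 2 = moderate, 3 = severe).
    raters = {
        "A": [0, 1, 2, 2, 3, 1, 0, 2],
        "B": [0, 1, 1, 2, 3, 2, 0, 3],
        "C": [1, 1, 2, 2, 2, 1, 0, 2],
    }

    def interpret(kappa: float) -> str:
        """Agreement bands after McHugh (2012)."""
        if kappa < 0.21:
            return "none"
        if kappa < 0.40:
            return "minimal"
        if kappa < 0.60:
            return "weak"
        if kappa < 0.80:
            return "moderate"
        if kappa < 0.90:
            return "strong"
        return "almost perfect"

    pairwise = []
    for (name1, labels1), (name2, labels2) in combinations(raters.items(), 2):
        kappa = cohen_kappa_score(labels1, labels2)
        pairwise.append(kappa)
        print(f"{name1} vs {name2}: kappa = {kappa:.2f} ({interpret(kappa)})")

    print(f"median pairwise kappa = {median(pairwise):.2f}")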

Objective: To evaluate the inter-observer agreement in judging the severity of epithelial dysplasia, providing a theoretical basis for the further development of a more objective grading system. Methods: Sixty digital pathological slide images from 60 patients with oral leukoplakia seen in the Department of Oral Medicine, West China Hospital of Stomatology, Sichuan University, from May 2013 to May 2018, and 239 tissue microarray images from 93 patients with oral leukoplakia collected by the State Key Laboratory of Oral Diseases, West China Hospital of Stomatology, Sichuan University, from September 2004 to September 2014, were used to evaluate the agreement of dysplasia grading. From the 60 digital slide images, 1 000 patches were generated, comprising 500 small-sized patches (224 pixels×224 pixels) and 500 large-sized patches (1 024 pixels×1 024 pixels), to evaluate the agreement of feature detection. Grading and feature detection were performed by three pathology experts from the oral pathology departments of two Grade 3, Class A stomatological hospitals in China. The Kappa coefficient was used to quantify inter-observer agreement among the experts. Results: Agreement on the grading of epithelial dysplasia was minimal (Kappa=0.30 in the pathological slide group and Kappa=0.30 in the tissue microarray group). There was no agreement in feature detection within the small-sized patch group (median Kappa=0.14 for architectural features, median Kappa=0.18 for cytological features), and minimal agreement within the large-sized patch group (median Kappa=0.25 for architectural features, median Kappa=0.25 for cytological features). Conclusions: Overall, the agreement on grading and feature detection of oral epithelial dysplasia in oral leukoplakia is poor. Developing a more objective, artificial intelligence-based system for judging the severity of epithelial dysplasia may help improve inter-observer agreement.
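The abstract does not describe how the 1 000 patches were generated from the digital slides. As a purely illustrative sketch under assumed details (the file name, pyramid level, and random sampling strategy are all assumptions, not the authors' procedure), the snippet below crops fixed-size 224-pixel and 1 024-pixel patches from a whole-slide image with the OpenSlide library.

    # Hypothetical sketch of cropping fixed-size patches from a whole-slide image;
    # path, pyramid level, and random sampling are assumptions for illustration.
    import random

    import openslide

    SLIDE_PATH = "example_slide.svs"   # hypothetical file name
    PATCH_SIZES = (224, 1024)          # small and large patch edge lengths (pixels)
    PATCHES_PER_SIZE = 5               # small count, for the sketch only

    slide = openslide.OpenSlide(SLIDE_PATH)
    width, height = slide.dimensions   # level-0 (full-resolution) dimensions

    for size in PATCH_SIZES:
        for i in range(PATCHES_PER_SIZE):
            # Pick a random top-left corner that keeps the patch inside the slide.
            x = random.randrange(0, width - size)
            y = random.randrange(0, height - size)
            # read_region returns an RGBA PIL image at the requested level (0 = full resolution).
            patch = slide.read_region((x, y), 0, (size, size)).convert("RGB")
            patch.save(f"patch_{size}px_{i}.png")

    slide.close()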

MeSH terms

  • Artificial Intelligence*
  • China
  • Humans
  • Leukoplakia, Oral
  • Observer Variation
  • Precancerous Conditions*