Restaining-based annotation for cancer histology segmentation to overcome annotation-related limitations among pathologists

Patterns (N Y). 2023 Feb 10;4(2):100688. doi: 10.1016/j.patter.2023.100688.

Abstract

Numerous cancer histopathology specimens have been collected and digitized over the past few decades. A comprehensive evaluation of the distribution of various cell types in tumor tissue sections can provide valuable information for understanding cancer. Deep learning is well suited to this task; however, extensive, unbiased training data are difficult to collect, which limits the accuracy of segmentation models. This study presents SegPath, the largest annotation dataset (>10 times larger than publicly available annotations) for the segmentation of hematoxylin and eosin (H&E)-stained sections, covering eight major cell types in cancer tissue. The SegPath generation pipeline used H&E-stained sections that were destained and subsequently immunofluorescence-stained with carefully selected antibodies. We found that SegPath is comparable with, or outperforms, annotations by pathologists. Moreover, annotations by pathologists are biased toward typical morphologies, whereas a model trained on SegPath can overcome this limitation. Our results provide foundational datasets for machine-learning research in histopathology.
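
The restaining idea can be illustrated with a minimal sketch: once an immunofluorescence image is aligned to its H&E counterpart, a binary mask for the antibody-positive cells can be derived from the fluorescence intensity. The abstract does not specify how masks are computed, so the Otsu thresholding used here is an assumption chosen for illustration, and the function names `otsu_threshold` and `if_to_mask` are hypothetical. The input is a synthetic image, not SegPath data.

```python
import numpy as np


def otsu_threshold(image: np.ndarray, bins: int = 256) -> float:
    """Return an Otsu threshold (hypothetical helper, not the paper's method)."""
    hist, edges = np.histogram(image.ravel(), bins=bins)
    centers = (edges[:-1] + edges[1:]) / 2

    # Pixel counts at or below / above each candidate split point.
    weight_bg = np.cumsum(hist)
    weight_fg = weight_bg[-1] - weight_bg

    # Class means, approximated from bin centers.
    cum_intensity = np.cumsum(hist * centers)
    mean_bg = cum_intensity / np.maximum(weight_bg, 1)
    mean_fg = (cum_intensity[-1] - cum_intensity) / np.maximum(weight_fg, 1)

    # Otsu's criterion: maximize the between-class variance.
    between = weight_bg * weight_fg * (mean_bg - mean_fg) ** 2
    best = int(np.argmax(between))

    # Use the right edge of the last background bin as the threshold.
    return float(edges[best + 1])


def if_to_mask(if_image: np.ndarray) -> np.ndarray:
    """Convert a single-channel immunofluorescence image to a binary mask."""
    return if_image >= otsu_threshold(if_image)


# Synthetic stand-in for an aligned immunofluorescence channel:
# dim background with one bright, antibody-positive square region.
rng = np.random.default_rng(0)
if_image = rng.normal(0.1, 0.02, size=(64, 64))
if_image[16:48, 16:48] = rng.normal(0.9, 0.02, size=(32, 32))

mask = if_to_mask(if_image)
```

The resulting `mask` would serve as the segmentation target paired with the H&E image of the same section. A real pipeline would also need image registration between the two staining rounds and handling of staining artifacts, neither of which is sketched here.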

Keywords: cancer; dataset; deep learning; digital pathology; eosin-stained; hematoxylin; histology; segmentation mask; semantic segmentation.