A review of image analysis and machine learning techniques for automated cervical cancer screening from pap-smear images

Comput Methods Programs Biomed. 2018 Oct:164:15-22. doi: 10.1016/j.cmpb.2018.05.034. Epub 2018 Jun 26.

Abstract

Background and objective: Early diagnosis and classification of a cancer type can facilitate the subsequent clinical management of the patient. Cervical cancer ranks as the fourth most prevalent cancer affecting women worldwide, and its early detection offers the opportunity to save lives. To that end, automated diagnosis and classification of cervical cancer from pap-smear images has become a necessity, as it enables accurate, reliable and timely analysis of the condition's progress. This paper presents an overview of the state of the art as articulated in prominent recent publications focusing on automated detection of cervical cancer from pap-smear images.

Methods: The survey covers publications, spanning 15 years, on applications of image analysis and machine learning in automated diagnosis and classification of cervical cancer from pap-smear images. It reviews 30 journal papers obtained electronically through four scientific databases (Google Scholar, Scopus, IEEE and Science Direct), searched using three sets of keywords: (1) segmentation, classification, cervical cancer; (2) medical imaging, machine learning, pap-smear; (3) automated system, classification, pap-smear.

Results: Most of the existing algorithms achieve an accuracy of nearly 93.78% on an open pap-smear data set, segmented using CHAMP digital image software. K-nearest-neighbors and support vector machine algorithms have been reported to be excellent classifiers for cervical images, with accuracies of over 99.27% and 98.5% respectively when applied to a 2-class classification problem (normal or abnormal).
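
To make the 2-class setting concrete, the following is a minimal sketch of a K-nearest-neighbors classifier and the accuracy measure, written in plain Python. The two features (nucleus area and nucleus-to-cytoplasm ratio), the toy values and the function names are assumptions chosen for illustration only; they are not taken from the reviewed papers, and the output says nothing about the accuracies reported above.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Label `query` by majority vote among its k nearest training points."""
    # Pair every training point's Euclidean distance to the query with its label.
    dists = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up feature vectors: (nucleus area, nucleus-to-cytoplasm ratio), scaled to [0, 1].
train_X = [
    (0.10, 0.05), (0.15, 0.08), (0.12, 0.10),  # hypothetical normal cells
    (0.70, 0.60), (0.80, 0.55), (0.75, 0.70),  # hypothetical abnormal cells
]
train_y = ["normal"] * 3 + ["abnormal"] * 3

test_X = [(0.11, 0.07), (0.78, 0.65)]
test_y = ["normal", "abnormal"]

predictions = [knn_predict(train_X, train_y, q) for q in test_X]
print(predictions, accuracy(test_y, predictions))
```

In practice the feature vectors would come from a segmentation step, so the classifier's measured accuracy depends on how well the cells were segmented.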

Conclusion: The reviewed papers indicate that there are still weaknesses in the available techniques that result in low classification accuracy for some classes of cells. Moreover, most of the existing algorithms work either on single or on multiple cervical smear images. Accuracy can be increased by varying the features to be extracted, improving noise removal, and using hybrid segmentation and classification techniques such as multi-level classifiers. Combining the K-nearest-neighbors algorithm with other algorithms such as support vector machines, applying pixel-level classification, and including statistical shape models can also improve performance. Further, most of the developed classifiers are tested on images accurately segmented using commercially available software such as CHAMP. There is thus a deficit of evidence that these algorithms will work in the clinical settings found in developing countries (where 85% of cervical cancer cases occur), which lack sufficient trained cytologists and the funds to buy commercial segmentation software.
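
One way to read "multi-level classifier" is a two-stage scheme: a first level separates normal from abnormal cells, and a second level assigns a subtype only to cells flagged as abnormal. The sketch below illustrates that idea under stated assumptions. It uses a nearest-centroid rule at both levels purely for brevity (not K-nearest-neighbors or support vector machines), and the class labels, features and values are hypothetical.

```python
import math


def centroid(points):
    """Component-wise mean of a list of equal-length feature tuples."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))


def fit_centroids(X, y):
    """Map each label to the centroid of its training points."""
    groups = {}
    for x, label in zip(X, y):
        groups.setdefault(label, []).append(x)
    return {label: centroid(pts) for label, pts in groups.items()}


def nearest_label(centroids, query):
    """Return the label whose centroid is closest to `query`."""
    return min(centroids, key=lambda label: math.dist(centroids[label], query))


def fit_two_level(X, y):
    # Level 1: collapse every non-normal label into a single "abnormal" class.
    coarse = ["normal" if label == "normal" else "abnormal" for label in y]
    level1 = fit_centroids(X, coarse)
    # Level 2: trained on abnormal samples only, keeping their subtype labels.
    abnormal = [(x, label) for x, label in zip(X, y) if label != "normal"]
    level2 = fit_centroids([x for x, _ in abnormal], [l for _, l in abnormal])
    return level1, level2


def predict_two_level(model, query):
    level1, level2 = model
    if nearest_label(level1, query) == "normal":
        return "normal"
    return nearest_label(level2, query)


# Made-up two-dimensional features and hypothetical class names.
X = [(0.10, 0.10), (0.12, 0.08), (0.50, 0.40), (0.55, 0.45), (0.90, 0.80), (0.85, 0.85)]
y = ["normal", "normal", "abnormal_mild", "abnormal_mild",
     "abnormal_severe", "abnormal_severe"]

model = fit_two_level(X, y)
results = [predict_two_level(model, q) for q in [(0.12, 0.09), (0.52, 0.42), (0.88, 0.82)]]
print(results)
```

The design choice here is that the second level never sees normal cells, so it can specialize in distinctions between abnormal classes; an error at the first level, however, cannot be corrected downstream.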

Keywords: Cervical cancer; Classification; Machine learning; Medical imaging; Pap-smear; Pap-smear images; Segmentation.

Publication types

  • Review

MeSH terms

  • Algorithms
  • Diagnosis, Computer-Assisted
  • Early Detection of Cancer / statistics & numerical data
  • Female
  • Humans
  • Image Interpretation, Computer-Assisted
  • Machine Learning
  • Papanicolaou Test / statistics & numerical data*
  • Uterine Cervical Neoplasms / classification
  • Uterine Cervical Neoplasms / diagnosis
  • Uterine Cervical Neoplasms / diagnostic imaging*
  • Vaginal Smears / statistics & numerical data*