Clin Cancer Res. 2018 Dec 1;24(23):5902-5909. doi: 10.1158/1078-0432.CCR-18-1115. Epub 2018 Oct 11.

Deep Learning to Distinguish Recalled but Benign Mammography Images in Breast Cancer Screening.

Author information

Department of Biomedical Informatics, University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania.
Department of Radiology, University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania.
Magee-Womens Hospital of University of Pittsburgh Medical Center, Pittsburgh, Pennsylvania.
Departments of Radiology, of Biomedical Informatics, of Bioengineering, and of Intelligent Systems, University of Pittsburgh, Pittsburgh, Pennsylvania.



False positives in digital mammography screening lead to high recall rates, resulting in unnecessary medical procedures for patients and added health care costs. This study investigated deep learning methods for distinguishing recalled but benign mammography images from negative exams and from those with malignancy.


Deep learning convolutional neural network (CNN) models were constructed to classify mammography images into malignant (breast cancer), negative (breast cancer free), and recalled-benign categories. A total of 14,860 images from 3,715 patients, drawn from two independent mammography datasets, the Full-Field Digital Mammography Dataset (FFDM) and a digitized film dataset, the Digital Dataset of Screening Mammography (DDSM), were used in various settings to train and test the CNN models. ROC curves were generated, and the AUC was calculated as a metric of classification accuracy.
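The ROC and AUC metric named above can be illustrated with a minimal sketch. It assumes a binary comparison (for example, malignant versus recalled-benign) in which each image receives a single classifier score; the abstract does not state how the three categories were paired for evaluation, so that framing is an assumption. The function names `roc_points`, `trapezoid_auc`, and `rank_auc` are hypothetical, and the labels and scores are made-up toy values, not data from the study.

```python
from typing import List, Sequence, Tuple


def roc_points(labels: Sequence[int], scores: Sequence[float]) -> List[Tuple[float, float]]:
    """Return (false positive rate, true positive rate) pairs for every threshold.

    labels: 1 for the positive class, 0 for the negative class.
    scores: classifier outputs, higher meaning "more likely positive".
    """
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = sum(1 for y in labels if y == 0)
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative example")
    points = [(0.0, 0.0)]  # threshold above every score: nothing is called positive
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


def trapezoid_auc(points: Sequence[Tuple[float, float]]) -> float:
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


def rank_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example (illustrative values only, not study data).
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
toy_auc = trapezoid_auc(roc_points(toy_labels, toy_scores))
```

Both functions give the same value on the toy data, which reflects the usual reading of AUC: the probability that a randomly chosen positive image is scored higher than a randomly chosen negative one.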


Training and testing using only the FFDM dataset yielded AUCs ranging from 0.70 to 0.81. When the DDSM dataset was used, AUCs ranged from 0.77 to 0.96. When the datasets were combined for training and testing, AUCs ranged from 0.76 to 0.91. When pretrained on a large nonmedical dataset and then on DDSM, the models showed consistent AUC improvements of 0.02 to 0.05 (all P > 0.05) compared with pretraining only on the nonmedical dataset.


This study demonstrates that automatic deep learning CNN methods can identify nuanced mammographic imaging features to distinguish recalled-benign images from malignant and negative cases, which may lead to a computerized clinical toolkit to help reduce false recalls.

[Available on 2019-12-01]
