Enhancing Annotation Efficiency with Machine Learning: Automated Partitioning of a Lung Ultrasound Dataset by View

Bennett VanBerlo; Delaney Smith; Jared Tschirhart; Blake VanBerlo; Derek Wu; Alex Ford; Joseph McCauley; Benjamin Wu; Rushil Chaudhary; Chintan Dave; Jordan Ho; Jason Deglint; Brian Li; Robert Arntfield

doi:10.3390/diagnostics12102351

Diagnostics (Basel). 2022 Sep 28;12(10):2351. doi: 10.3390/diagnostics12102351.

Authors

Bennett VanBerlo¹, Delaney Smith², Jared Tschirhart³, Blake VanBerlo², Derek Wu⁴, Alex Ford⁵, Joseph McCauley⁶, Benjamin Wu⁵, Rushil Chaudhary⁴, Chintan Dave⁷, Jordan Ho⁸, Jason Deglint⁶, Brian Li⁶, Robert Arntfield⁷

Affiliations

¹ Faculty of Engineering, University of Western Ontario, London, ON N6A 5C1, Canada.
² Faculty of Mathematics, University of Waterloo, Waterloo, ON N2L 3G1, Canada.
³ Schulich School of Medicine and Dentistry, Western University, London, ON N6A 5C1, Canada.
⁴ Department of Medicine, Western University, London, ON N6A 5C1, Canada.
⁵ Lawson Health Research Institute, London, ON N6C 2R5, Canada.
⁶ Faculty of Engineering, University of Waterloo, Waterloo, ON N2L 3G1, Canada.
⁷ Division of Critical Care Medicine, Western University, London, ON N6A 5C1, Canada.
⁸ Department of Family Medicine, Western University, London, ON N6A 5C1, Canada.

Abstract

Background: Annotating large medical imaging datasets is an arduous and expensive task, especially when the datasets in question are not organized according to deep learning goals. Here, we propose a method that exploits the hierarchical organization of annotation tasks to improve annotation efficiency.

Methods: We trained a machine learning model to distinguish between two classes of lung ultrasound (LUS) views, using 2908 clips from a larger dataset. Partitioning the remaining dataset by view would reduce downstream labelling effort by letting annotators focus on the pathological features specific to each view.
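
The partitioning step described in the Methods can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name partition_by_view, the placeholder labels "view_a" and "view_b", the 0.5 threshold, and the clip identifiers are all hypothetical, and predict_prob stands in for whatever trained binary view classifier is available.

```python
from typing import Callable, Dict, Iterable, List


def partition_by_view(
    clip_ids: Iterable[str],
    predict_prob: Callable[[str], float],
    threshold: float = 0.5,
) -> Dict[str, List[str]]:
    """Split clips into two view-specific annotation queues.

    predict_prob is assumed to return the classifier's probability that a
    clip belongs to the second view class ("view_b", a placeholder name).
    """
    partitions: Dict[str, List[str]] = {"view_a": [], "view_b": []}
    for clip_id in clip_ids:
        label = "view_b" if predict_prob(clip_id) >= threshold else "view_a"
        partitions[label].append(clip_id)
    return partitions


# Stand-in for a trained model: made-up probabilities, for illustration only.
fake_probs = {"clip_001": 0.93, "clip_002": 0.08, "clip_003": 0.61}
queues = partition_by_view(fake_probs, lambda clip_id: fake_probs[clip_id])
```

Each resulting queue could then be handed to annotators working on the labelling task specific to that view, which is the hierarchy the abstract refers to.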

Results: In a sample view-specific annotation task, we found that automatically partitioning a 780-clip dataset by view saved 42 min of manual annotation time and resulted in 55±6 additional relevant labels per hour.
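
For a sense of scale, the reported saving can be converted into a per-clip figure. The short calculation below uses only the two numbers stated in the Results (42 min, 780 clips) and assumes, as a simplification not stated in the abstract, that the saved time is spread evenly across all clips.

```python
# Figures taken from the Results paragraph of the abstract.
clips = 780
minutes_saved = 42

# Assumption: the saved time is distributed evenly over every clip.
seconds_saved_per_clip = minutes_saved * 60 / clips
print(f"{seconds_saved_per_clip:.1f} s saved per clip")  # prints "3.2 s saved per clip"
```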

Conclusions: Automatic partitioning of a LUS dataset by view significantly increases annotator efficiency, yielding more labels relevant to the annotation task at hand per unit of annotator time. The strategy described in this work can be applied to other hierarchical annotation schemes.

Keywords: annotation; computer vision; deep learning; labelling; lung ultrasound; machine learning; medical imaging.

Grants and funding

This research received no external funding.