Characterizing and Optimizing Rater Performance for Internet-based Collaborative Labeling

Joshua A Stein; Andrew J Asman; Bennett A Landman

doi:10.1117/12.878412

Proc SPIE Int Soc Opt Eng. 2011 Mar 3:7966:79660M. doi: 10.1117/12.878412.

Authors

Joshua A Stein¹, Andrew J Asman, Bennett A Landman

Affiliation

¹ Electrical Engineering, Vanderbilt University, Nashville, TN, USA 37235.

Abstract

Labeling structures on medical images is crucial in determining clinically relevant correlations with morphometric and volumetric features. For the exploration of new structures and new imaging modalities, validated automated methods do not yet exist, and so researchers must rely on manually drawn landmarks. Voxel-by-voxel labeling can be extremely resource intensive, so large-scale studies are problematic. Recently, statistical approaches and software have been proposed to enable Internet-based collaborative labeling of medical images. While numerous labeling software tools have been created, the use of these packages as high-throughput labeling systems has yet to become entirely viable given training requirements. Herein, we explore two modifications to a typical mouse-based labeling system: (1) a platform-independent overlay for recognition of mouse gestures and (2) an inexpensive touch-screen tracking device for non-mouse input. Through this study we characterize rater reliability in point, line, curve, and region placement. For the mouse input, we find a placement accuracy of 2.48±5.29 pixels (point), 0.630±1.81 pixels (curve), 1.234±6.99 pixels (line), and 0.058±0.027 (1 - Jaccard Index for region). The gesture software increased labeling speed by 27% overall and accuracy by approximately 30-50% on point and line tracing tasks, but the touch-screen module led to slower and more error-prone labeling on all tasks, likely due to relatively poor sensitivity. In summary, the mouse gesture integration layer runs as a seamless operating system overlay and could potentially benefit any labeling software; yet, the inexpensive touch-screen system requires improved usability optimization and calibration before it can provide an efficient labeling system.
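The region metric reported above, 1 - Jaccard Index, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the representation of a region as a set of (row, col) pixel coordinates, and the toy rectangles are all assumptions made for illustration. A value of 0 means the rater's region matches the reference exactly, and 1 means no overlap.

```python
def jaccard_error(region_a, region_b):
    """Return 1 - |A ∩ B| / |A ∪ B| for two regions.

    Each region is an iterable of (row, col) pixel coordinates
    (a representation assumed here, not taken from the paper).
    """
    a, b = set(region_a), set(region_b)
    union = a | b
    if not union:
        # Convention assumed for this sketch: two empty regions are identical.
        return 0.0
    return 1.0 - len(a & b) / len(union)


def rectangle(r0, c0, r1, c1):
    """Hypothetical helper: pixels of a rectangle with rows r0..r1-1, cols c0..c1-1."""
    return {(r, c) for r in range(r0, r1) for c in range(c0, c1)}


# Toy example: a 10x10 reference region and a rater's region shifted by one column.
truth = rectangle(0, 0, 10, 10)
rater = rectangle(0, 1, 10, 11)
error = jaccard_error(truth, rater)  # overlap 90 pixels, union 110 pixels
print(f"1 - Jaccard = {error:.3f}")
```

In this toy case a one-pixel shift of a 10x10 region gives an error of 20/110, which gives a sense of scale for the much smaller region error reported for mouse input.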
