Purpose: The primary aim of this study was to test the feasibility of predicting diagnostic errors in mammography by merging radiologists’ gaze behavior and image characteristics. A secondary aim was to investigate group-based and personalized predictive models for radiologists of varying experience levels.
Methods: The study was performed for the clinical task of assessing the likelihood of malignancy of mammographic masses. Eye-tracking data and diagnostic decisions for 40 cases were acquired from four radiology residents and two breast imaging experts as part of an IRB-approved pilot study. Gaze behavior features were extracted from the eye-tracking data. Computer-generated and BI-RADS image features were extracted from the images. Finally, machine learning algorithms were used to merge gaze and image features for predicting human error. Feature selection was thoroughly explored to determine the relative contribution of the various features. Group-based and personalized user modeling was also investigated.
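As an illustration only, the merging step described above might be sketched as follows. This is not the study's pipeline: the nearest-centroid scorer, the leave-one-out scheme, the feature values, and every function name are hypothetical stand-ins chosen to keep the example dependency-free. The sketch concatenates gaze and image features for each reading, scores each held-out reading by how close it lies to the "error" centroid, and summarizes the result with the area under the ROC curve (AUC).

```python
from statistics import mean

def merge_features(gaze, image):
    """Concatenate per-reading gaze features with per-case image features."""
    return list(gaze) + list(image)

def auc(scores, labels):
    """Area under the ROC curve via the Mann-Whitney pairwise statistic."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need both error and non-error readings")
    # Each (error, non-error) pair counts 1 if ranked correctly, 0.5 if tied.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def centroid(rows):
    """Column-wise mean of a list of feature vectors."""
    return [mean(col) for col in zip(*rows)]

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def loo_error_scores(X, y):
    """Leave-one-out scores; higher means closer to the 'error' centroid.

    Hypothetical stand-in classifier: requires at least two readings
    per class so that each class survives the hold-out.
    """
    scores = []
    for i in range(len(X)):
        train = [(x, t) for j, (x, t) in enumerate(zip(X, y)) if j != i]
        err = centroid([x for x, t in train if t == 1])
        ok = centroid([x for x, t in train if t == 0])
        scores.append(sq_dist(X[i], ok) - sq_dist(X[i], err))
    return scores

# Invented toy readings: (gaze features, image features, error label).
readings = [
    ([0.9], [0.8], 1), ([0.8], [0.9], 1), ([1.0], [1.0], 1),
    ([0.1], [0.2], 0), ([0.2], [0.1], 0), ([0.0], [0.0], 0),
]
X = [merge_features(g, im) for g, im, _ in readings]
y = [label for _, _, label in readings]
print("leave-one-out AUC:", auc(loo_error_scores(X, y), y))
```

Under the same assumptions, a group model would pool the readings of all readers into `X` and `y`, whereas a personalized model would repeat the procedure on a single reader's readings.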
Results: Diagnostic error can be predicted reliably by merging gaze behavior characteristics from the radiologist and textural characteristics from the image under review. Leveraging data collected from multiple readers produced a reasonable group model (AUC = 0.79). Personalized user modeling was far more accurate for the more experienced readers (average AUC of 0.837 ± 0.029) than for the less experienced readers (average AUC of 0.667 ± 0.099). The best-performing group-based and personalized predictive models involved combinations of both gaze and image features.
Conclusions: Diagnostic errors in mammography can be predicted reliably by leveraging the radiologists’ gaze behavior and image content.