Distilling Knowledge from Ensembles of Cluster-Constrained-Attention Multiple-Instance Learners for Whole Slide Image Classification

Publication Type

Conference Paper

Book Title

2022 IEEE International Conference on Big Data (Big Data)

Publication Date

December, 2022

Page Numbers

3393 to 3397

Publisher Location

New Jersey, United States of America

Conference Name

The 4th International Workshop on Big Data Tools, Methods, and Use Cases for Innovative Scientific Discovery

Conference Location

Osaka, Japan

Conference Sponsor

IEEE

Conference Date

Dec 17, 2022

Abstract

Whole slide imaging (WSI), which digitizes conventional glass slides into multiple high-resolution images that capture microscopic details of a patient’s histopathological features, has garnered increased interest from the computer vision research community over the last two decades. Given the computational space and time complexity inherent to gigapixel-size whole slide image data, researchers have proposed novel machine learning algorithms to aid diagnostic tasks in clinical pathology. One effective approach represents a whole slide image as a bag of smaller image patches, each of which can be represented as a low-dimensional patch embedding. Weakly supervised deep-learning methods, such as cluster-constrained-attention multiple-instance learning (CLAM), have shown promising results when combined with image patch embeddings. While traditional ensemble classifiers yield improved task performance, they come with a steep cost in model complexity. Through knowledge distillation, it is possible to retain some of the performance improvements of an ensemble while minimizing the cost in model complexity. In this work, we implement a weakly supervised ensemble of CLAM models, which use attention and instance-level clustering to identify task-salient regions and extract features from whole slides. By applying logit-based and attention-based knowledge distillation, we show it is possible to retain some performance improvements resulting from the ensemble at zero cost to model complexity.
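As an illustration of the logit-based distillation mentioned in the abstract, the following is a minimal sketch of a generic temperature-scaled distillation loss for a single slide, in which a student is trained against the averaged soft predictions of several teachers. It is not the authors' implementation: the function names, the temperature of 2.0, and the weighting factor alpha are all assumptions chosen for illustration, and the attention-based distillation term is not shown.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over one slide's class logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_soft_targets(teacher_logits, temperature=2.0):
    """Average the softened class probabilities of all teachers (hypothetical helper)."""
    probs = [softmax(logits, temperature) for logits in teacher_logits]
    n_teachers = len(probs)
    n_classes = len(probs[0])
    return [sum(p[c] for p in probs) / n_teachers for c in range(n_classes)]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=2.0, alpha=0.5):
    """Weighted sum of KL(ensemble || student) on softened outputs and
    cross-entropy against the slide-level label.

    temperature and alpha are illustrative assumptions, not values from the paper.
    """
    targets = ensemble_soft_targets(teacher_logits, temperature)
    student_soft = softmax(student_logits, temperature)
    # KL divergence between the ensemble's soft targets and the student's soft output
    kl = sum(t * (math.log(t) - math.log(s))
             for t, s in zip(targets, student_soft) if t > 0)
    # Standard cross-entropy on the unsoftened student prediction
    ce = -math.log(softmax(student_logits)[label])
    # The temperature**2 factor keeps the soft-target term on a comparable scale
    return alpha * temperature ** 2 * kl + (1 - alpha) * ce
```

In this sketch, the student keeps the size of a single model, so any benefit inherited from the ensemble arrives through the training signal rather than through extra parameters at inference time.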
