Self-Supervision Closes the Gap Between Weak and Strong Supervision in Histology

2020-12-07 10:59:38

Olivier Dehaene, Axel Camara, Olivier Moindrot, Axel de Lavergne, Pierre Courtiol

arXiv_CV

Abstract

One of the biggest challenges for applying machine learning to histopathology is weak supervision: whole-slide images have billions of pixels yet often only one global label. The state of the art therefore relies on strongly-supervised model training using additional local annotations from domain experts. However, in the absence of detailed annotations, most weakly-supervised approaches depend on a frozen feature extractor pre-trained on ImageNet. We identify this as a key weakness and propose to train an in-domain feature extractor on histology images using MoCo v2, a recent self-supervised learning algorithm. Experimental results on Camelyon16 and TCGA show that the proposed extractor greatly outperforms its ImageNet counterpart. In particular, our results improve the weakly-supervised state of the art on Camelyon16 from 91.4% to 98.7% AUC, thereby closing the gap with strongly-supervised models that reach 99.3% AUC. Through these experiments, we demonstrate that feature extractors trained via self-supervised learning can act as drop-in replacements to significantly improve existing machine learning techniques in histology. Lastly, we show that the learned embedding space exhibits biologically meaningful separation of tissue structures.
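The abstract names MoCo v2 as the self-supervised algorithm but gives no training details. As a rough illustration only, the two core ingredients of MoCo-style training (a momentum-updated key encoder and an InfoNCE contrastive loss) can be sketched in plain Python. This is not the authors' code: the function names are hypothetical, the encoders are reduced to flat parameter lists, and the momentum (0.999) and temperature (0.2) values are assumed defaults rather than settings reported in the abstract.

```python
import math

def momentum_update(query_params, key_params, m=0.999):
    """Move key-encoder parameters toward the query encoder (exponential moving average).

    Parameters are flat lists of floats here; m=0.999 is an assumed default.
    """
    return [m * k + (1.0 - m) * q for q, k in zip(query_params, key_params)]

def l2_normalize(v):
    """Scale a non-zero vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def info_nce_loss(query, positive_key, negative_keys, temperature=0.2):
    """Contrastive loss for one query embedding.

    The positive key is the embedding of another augmented view of the same
    tile; negative keys stand in for a queue of embeddings of other tiles.
    temperature=0.2 is an assumed default.
    """
    q = l2_normalize(query)
    # Index 0 holds the positive logit, followed by one logit per negative.
    logits = [dot(q, l2_normalize(positive_key)) / temperature]
    logits += [dot(q, l2_normalize(k)) / temperature for k in negative_keys]
    # Cross-entropy with target index 0, via a numerically stable log-sum-exp.
    mx = max(logits)
    log_sum_exp = mx + math.log(sum(math.exp(l - mx) for l in logits))
    return log_sum_exp - logits[0]

# Toy 2-D embeddings: the loss is small when the query matches its positive key.
aligned = info_nce_loss([1.0, 0.0], [1.0, 0.0], [[-1.0, 0.0], [0.0, 1.0]])
mismatched = info_nce_loss([1.0, 0.0], [-1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
```

In this toy setup `aligned` comes out lower than `mismatched`, which is the signal that pulls embeddings of two views of the same tile together. A real pipeline would compute the embeddings with a convolutional encoder over augmented histology tiles and backpropagate only through the query encoder.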

URL

https://arxiv.org/abs/2012.03583

PDF

https://arxiv.org/pdf/2012.03583.pdf