Momentum^2 Teacher: Momentum Teacher with Momentum Statistics for Self-Supervised Learning

2021-01-19 09:27:03

Zeming Li, Songtao Liu, Jian Sun

arXiv_CV

arXiv_CV Self-Supervised

Abstract

In this paper, we present a novel approach, Momentum$^2$ Teacher, for student-teacher based self-supervised learning. The approach performs a momentum update on both the network weights and the batch normalization (BN) statistics. The teacher's weights are a momentum update of the student's weights, and the teacher's BN statistics are a momentum update of their historical values. The Momentum$^2$ Teacher is simple and efficient. It achieves state-of-the-art results (74.5%) under the ImageNet linear evaluation protocol using a small batch size (e.g., 128), without requiring large-batch training on special hardware such as TPUs or inefficient cross-GPU operations (e.g., shuffling BN, synced BN). Our implementation and pre-trained models will be released on GitHub.
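
As a rough illustration of the two momentum updates described in the abstract, the following minimal sketch applies an exponential moving average to both a teacher's weights and its BN statistics. It is not the authors' implementation: the helper names (`ema`, `batch_stats`), the toy values, and both momentum coefficients are assumptions chosen for illustration, and the abstract does not specify the exact update rule for the BN statistics.

```python
from typing import List, Tuple


def ema(history: List[float], current: List[float], momentum: float) -> List[float]:
    """Exponential moving average: momentum * history + (1 - momentum) * current."""
    return [momentum * h + (1.0 - momentum) * c for h, c in zip(history, current)]


def batch_stats(batch: List[List[float]]) -> Tuple[List[float], List[float]]:
    """Per-feature mean and (biased) variance of a small batch."""
    n = len(batch)
    dims = len(batch[0])
    mean = [sum(row[d] for row in batch) / n for d in range(dims)]
    var = [sum((row[d] - mean[d]) ** 2 for row in batch) / n for d in range(dims)]
    return mean, var


# Hypothetical toy state; real networks hold tensors, not short lists.
student_weights = [1.0, 2.0]
teacher_weights = [0.0, 0.0]
teacher_mean, teacher_var = [0.0, 0.0], [1.0, 1.0]

weight_momentum = 0.99  # assumed value, not taken from the paper
stat_momentum = 0.9     # assumed value, not taken from the paper

# Statistics of the current small batch (2 samples, 2 features).
batch = [[1.0, 2.0], [3.0, 6.0]]
mean, var = batch_stats(batch)

# Momentum update 1: teacher weights track the student weights.
teacher_weights = ema(teacher_weights, student_weights, weight_momentum)

# Momentum update 2: teacher BN statistics track their own history,
# blended with the statistics of the current batch.
teacher_mean = ema(teacher_mean, mean, stat_momentum)
teacher_var = ema(teacher_var, var, stat_momentum)
```

Because the BN statistics are accumulated over training steps rather than computed from one large batch, a sketch like this needs no cross-GPU synchronization, which is the efficiency argument the abstract makes for small-batch training.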

URL

https://arxiv.org/abs/2101.07525

PDF

https://arxiv.org/pdf/2101.07525.pdf