ResMLP: Feedforward networks for image classification with data-efficient training

2021-05-07 17:31:44

Hugo Touvron, Piotr Bojanowski, Mathilde Caron, Matthieu Cord, Alaaeldin El-Nouby, Edouard Grave, Armand Joulin, Gabriel Synnaeve, Jakob Verbeek, Hervé Jégou

arXiv_CV Classification Image_Classification

Abstract

We present ResMLP, an architecture built entirely upon multi-layer perceptrons for image classification. It is a simple residual network that alternates (i) a linear layer in which image patches interact, independently and identically across channels, and (ii) a two-layer feed-forward network in which channels interact independently per patch. When trained with a modern training strategy using heavy data-augmentation and optionally distillation, it attains surprisingly good accuracy/complexity trade-offs on ImageNet. We will share our code based on the Timm library and pre-trained models.
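The two alternating sublayers described above can be sketched in a few lines. This is a minimal, dependency-free illustration and not the authors' Timm-based code: the function names, the GELU nonlinearity, the omission of biases and normalization, and the weight shapes are all assumptions made for the example. An image is represented as a patches x channels matrix stored as a list of rows.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def add(a, b):
    """Element-wise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def gelu(v):
    """GELU activation (assumed here; the abstract does not name the nonlinearity)."""
    return 0.5 * v * (1.0 + math.erf(v / math.sqrt(2.0)))


def resmlp_block(x, w_patch, w1, w2):
    """One residual block in the spirit of the abstract (hypothetical helper).

    x:       patches x channels input
    w_patch: patches x patches weights for the cross-patch linear layer
    w1:      channels x hidden weights, first feed-forward layer
    w2:      hidden x channels weights, second feed-forward layer
    """
    # (i) Patches interact: the same patches x patches matrix mixes every
    # channel column, so channels are handled independently and identically.
    x = add(x, matmul(w_patch, x))
    # (ii) Channels interact: a two-layer feed-forward network is applied
    # to each patch row on its own, so patches do not mix here.
    hidden = [[gelu(v) for v in row] for row in matmul(x, w1)]
    return add(x, matmul(hidden, w2))


def random_matrix(rows, cols, rng, scale=0.1):
    """Small random matrix, used only to exercise the sketch."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


# Usage: 4 patches, 3 channels, hidden width 6 (arbitrary toy sizes).
rng = random.Random(0)
patches, channels, hidden_dim = 4, 3, 6
x = random_matrix(patches, channels, rng, scale=1.0)
y = resmlp_block(
    x,
    random_matrix(patches, patches, rng),
    random_matrix(channels, hidden_dim, rng),
    random_matrix(hidden_dim, channels, rng),
)
print(len(y), len(y[0]))  # output keeps the patches x channels shape
```

Because both sublayers are wrapped in residual connections, the block preserves the input shape and reduces to the identity when all weights are zero, which is what allows such blocks to be stacked.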

URL

https://arxiv.org/abs/2105.03404

PDF

https://arxiv.org/pdf/2105.03404.pdf