Switchable Whitening for Deep Representation Learning

Abstract

Normalization methods are essential components in convolutional neural networks (CNNs). They either standardize or whiten data using statistics estimated in predefined sets of pixels. Unlike existing works that design normalization techniques for specific tasks, we propose Switchable Whitening (SW), which provides a general form unifying different whitening methods as well as standardization methods. SW learns to switch among these operations in an end-to-end manner. It has several advantages. First, SW adaptively selects appropriate whitening or standardization statistics for different tasks (see Fig. 1), making it well suited for a wide range of tasks without manual design. Second, by integrating benefits of different normalizers, SW shows consistent improvements over its counterparts in various challenging benchmarks. Third, SW serves as a useful tool for understanding the characteristics of whitening and standardization techniques. We show that SW outperforms other alternatives on image classification (CIFAR-10/100, ImageNet), semantic segmentation (ADE20K, Cityscapes), domain adaptation (GTA5, Cityscapes), and image style transfer (COCO). For example, without bells and whistles, we achieve state-of-the-art performance with 45.33% mIoU on the ADE20K dataset. Code and models will be released.
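
The idea of switching among whitening and standardization statistics can be illustrated with a minimal NumPy sketch. This is not the authors' released code; it rests on several assumptions. The function name `switchable_whiten`, the logit parameters `mean_logits` and `cov_logits`, the choice of four candidate statistics (batch whitening, instance whitening, batch standardization, instance standardization), and the use of an eigendecomposition for the inverse square root are all illustrative choices. Standardization is treated here as whitening with a diagonal covariance, and the learnable scale and shift that a full layer would typically carry are omitted.

```python
import numpy as np


def softmax(z):
    """Numerically stable softmax over a 1-D array of logits."""
    e = np.exp(z - z.max())
    return e / e.sum()


def switchable_whiten(x, mean_logits, cov_logits, eps=1e-5):
    """Hypothetical sketch of a switchable whitening forward pass.

    x           : array of shape (N, C, H, W)
    mean_logits : 2 logits weighting the (batch, instance) means
    cov_logits  : 4 logits weighting the (batch-whitening, instance-whitening,
                  batch-standardization, instance-standardization) covariances
    """
    n, c, h, w = x.shape
    flat = x.reshape(n, c, h * w)                        # (N, C, HW)
    eye = np.eye(c)

    # Instance statistics: one mean and covariance per sample.
    mu_inst = flat.mean(axis=2, keepdims=True)           # (N, C, 1)
    xi = flat - mu_inst
    cov_iw = xi @ xi.transpose(0, 2, 1) / (h * w)        # (N, C, C)

    # Batch statistics: one mean and covariance shared by all samples.
    mu_batch = flat.mean(axis=(0, 2), keepdims=True)     # (1, C, 1)
    xb = (flat - mu_batch).transpose(1, 0, 2).reshape(c, -1)
    cov_bw = (xb @ xb.T / xb.shape[1])[None]             # (1, C, C)

    # Standardization keeps only the variances (diagonal of the covariance).
    cov_in = cov_iw * eye
    cov_bn = cov_bw * eye

    # In a trained network the logits would be learned; softmax turns them
    # into importance weights that sum to one.
    wm = softmax(np.asarray(mean_logits, dtype=float))
    wc = softmax(np.asarray(cov_logits, dtype=float))
    mu = wm[0] * mu_batch + wm[1] * mu_inst              # (N, C, 1)
    cov = (wc[0] * cov_bw + wc[1] * cov_iw
           + wc[2] * cov_bn + wc[3] * cov_in + eps * eye)  # (N, C, C)

    # ZCA-style whitening: multiply by cov^{-1/2} via eigendecomposition.
    vals, vecs = np.linalg.eigh(cov)                     # batched over N
    inv_sqrt = (vecs / np.sqrt(vals)[:, None, :]) @ vecs.transpose(0, 2, 1)
    out = inv_sqrt @ (flat - mu)
    return out.reshape(n, c, h, w)
```

Pushing the logits toward one candidate recovers that normalizer: for example, `switchable_whiten(x, [50.0, 0.0], [50.0, 0.0, 0.0, 0.0])` behaves approximately like batch whitening, while equal logits blend all candidates evenly.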

URL

https://arxiv.org/abs/1904.09739

PDF

https://arxiv.org/pdf/1904.09739.pdf