Structured Knowledge Distillation for Semantic Segmentation

Abstract

In this paper, we investigate knowledge distillation strategies for training small semantic segmentation networks by making use of large networks. We start from the straightforward scheme, pixel-wise distillation, which applies the distillation scheme adopted for image classification and performs knowledge distillation for each pixel separately. We further propose to distill structured knowledge from large networks to small networks, motivated by the fact that semantic segmentation is a structured prediction problem. We study two structured distillation schemes: (i) pair-wise distillation, which distills the pairwise similarities, and (ii) holistic distillation, which uses a GAN to distill holistic knowledge. The effectiveness of our knowledge distillation approaches is demonstrated by extensive experiments on three scene parsing datasets: Cityscapes, CamVid and ADE20K.
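
The abstract does not give the loss formulas, so the following is only a minimal sketch of how the pixel-wise and pair-wise schemes might look, not the authors' implementation. It assumes that pixel-wise distillation is a per-pixel KL divergence between teacher and student class probabilities, and that pair-wise distillation matches cosine similarities between all pairs of spatial locations in a feature map. All function names, array shapes, and the small epsilon constants are hypothetical choices made for illustration. The GAN-based holistic scheme is not sketched here.

```python
import numpy as np


def softmax(logits, axis=0):
    """Numerically stable softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def pixelwise_distillation(student_logits, teacher_logits):
    """Mean per-pixel KL(teacher || student).

    Both inputs are assumed to have shape (C, H, W): class scores per pixel.
    Each pixel is treated as an independent classification problem.
    """
    p_teacher = softmax(teacher_logits, axis=0)
    log_p_teacher = np.log(p_teacher + 1e-12)
    log_p_student = np.log(softmax(student_logits, axis=0) + 1e-12)
    kl_per_pixel = (p_teacher * (log_p_teacher - log_p_student)).sum(axis=0)
    return kl_per_pixel.mean()


def pairwise_similarity(features):
    """Cosine similarity between every pair of spatial locations.

    features has shape (D, H, W); the result has shape (H*W, H*W).
    """
    depth = features.shape[0]
    flat = features.reshape(depth, -1)
    flat = flat / (np.linalg.norm(flat, axis=0, keepdims=True) + 1e-12)
    return flat.T @ flat


def pairwise_distillation(student_feats, teacher_feats):
    """Mean squared difference between student and teacher similarity maps.

    The channel counts may differ, but the spatial sizes must match.
    """
    diff = pairwise_similarity(student_feats) - pairwise_similarity(teacher_feats)
    return (diff ** 2).mean()
```

Because the pair-wise term compares similarity matrices rather than raw features, the student and teacher feature maps can have different channel counts in this sketch, as long as their spatial sizes agree.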

URL

https://arxiv.org/abs/1903.04197

PDF

https://arxiv.org/pdf/1903.04197.pdf