Layer-Wise Data-Free CNN Compression

2020-11-18 03:00:05

Maxwell Horton, Yanzi Jin, Ali Farhadi, Mohammad Rastegari

arXiv_CV Quantization

Abstract

We present an efficient method for compressing a trained neural network without using any data. Our data-free method requires 14x-450x fewer FLOPs than comparable state-of-the-art methods. We break the problem of data-free network compression into a number of independent layer-wise compressions. We show how to efficiently generate layer-wise training data, and how to precondition the network to maintain accuracy during layer-wise compression. We show state-of-the-art performance on MobileNetV1 for data-free low-bit-width quantization. We also show state-of-the-art performance on data-free pruning of EfficientNet B0 when combining our method with end-to-end generative methods.
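
The idea of compressing one layer at a time against its own outputs can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not say how the layer-wise training data is generated or how the network is preconditioned, so the sketch simply assumes Gaussian random inputs, a single linear layer, and a closed-form per-row rescaling after uniform quantization. All names (`quantize_uniform`, `compress_layer`, `synthetic_inputs`) are hypothetical.

```python
import random

def quantize_uniform(row, bits):
    """Symmetric uniform quantization of one weight row to `bits` bits."""
    levels = 2 ** (bits - 1) - 1
    max_abs = max(abs(w) for w in row) or 1.0
    step = max_abs / levels
    return [round(w / step) * step for w in row]

def layer_output(weights, x):
    """Apply a linear layer (a list of weight rows) to one input vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

def mse(weights_a, weights_b, inputs):
    """Mean squared difference between the outputs of two layers."""
    total, count = 0.0, 0
    for x in inputs:
        out_a = layer_output(weights_a, x)
        out_b = layer_output(weights_b, x)
        for a, b in zip(out_a, out_b):
            total += (a - b) ** 2
            count += 1
    return total / count

def compress_layer(weights, bits, inputs):
    """Quantize each row, then rescale it by the least-squares factor that
    best matches the original row's outputs on the synthetic inputs."""
    compressed = []
    for row in weights:
        q_row = quantize_uniform(row, bits)
        num = den = 0.0
        for x in inputs:
            y = sum(w * xi for w, xi in zip(row, x))
            y_q = sum(w * xi for w, xi in zip(q_row, x))
            num += y * y_q
            den += y_q * y_q
        scale = num / den if den > 0 else 1.0
        compressed.append([scale * w for w in q_row])
    return compressed

# Assumption: Gaussian noise stands in for layer-wise training data.
rng = random.Random(0)
in_dim, out_dim, n_samples = 8, 4, 256
weights = [[rng.gauss(0, 1) for _ in range(in_dim)] for _ in range(out_dim)]
synthetic_inputs = [[rng.gauss(0, 1) for _ in range(in_dim)]
                    for _ in range(n_samples)]

naive = [quantize_uniform(row, 3) for row in weights]
fitted = compress_layer(weights, 3, synthetic_inputs)
err_naive = mse(weights, naive, synthetic_inputs)
err_fitted = mse(weights, fitted, synthetic_inputs)
print("output MSE, quantization only:", err_naive)
print("output MSE, layer-wise fitted:", err_fitted)
```

Because each layer is fitted only against its own original outputs, the layers can in principle be processed independently, which is the property the abstract relies on. In a real network the synthetic inputs would need to resemble the activations that layer actually sees, which this sketch does not attempt.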

URL

https://arxiv.org/abs/2011.09058

PDF

https://arxiv.org/pdf/2011.09058.pdf