Progressive DNN Compression: A Key to Achieve Ultra-High Weight Pruning and Quantization Rates using ADMM

Abstract

Weight pruning and weight quantization are two important categories of DNN model compression. Prior work on these techniques is mainly based on heuristics. A recent work developed a systematic framework for DNN weight pruning using the advanced optimization technique ADMM (Alternating Direction Method of Multipliers), achieving one of the state-of-the-art weight pruning results. In this work, we first extend this one-shot ADMM-based framework to guarantee solution feasibility and provide a fast convergence rate, and generalize it to weight quantization as well. We then develop a multi-step, progressive DNN weight pruning and quantization framework, with the dual benefits of (i) achieving further weight pruning/quantization thanks to the special property of ADMM regularization, and (ii) reducing the search space within each step. Extensive experimental results demonstrate superior performance compared with prior work. Some highlights: (i) we achieve 246x, 36x, and 8x weight pruning on LeNet-5, AlexNet, and ResNet-50 models, respectively, with (almost) zero accuracy loss; (ii) even a significant 61x weight pruning in AlexNet (ImageNet) results in only minor degradation in actual accuracy compared with prior work; (iii) we are among the first to derive notable weight pruning results for ResNet and MobileNet models; (iv) we derive the first lossless, fully binarized (for all layers) LeNet-5 for MNIST and VGG-16 for CIFAR-10; and (v) we derive the first fully binarized (for all layers) ResNet for ImageNet with reasonable accuracy loss.
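
To make the ideas in the abstract concrete, the following is a minimal sketch of ADMM-based pruning with a progressive schedule. It is not the authors' implementation. It assumes a toy quadratic loss 0.5*||w - target||^2 in place of a real DNN training loss, so that the W-step has a closed form (in a real network this step would be solved approximately by SGD). All function names, the sparsity schedule, and the choice of mean(|v|) as the binarization level are illustrative assumptions.

```python
def project_topk(v, k):
    """Euclidean projection onto vectors with at most k nonzeros:
    keep the k largest-magnitude entries and zero out the rest."""
    order = sorted(range(len(v)), key=lambda i: abs(v[i]), reverse=True)
    keep = set(order[:k])
    return [x if i in keep else 0.0 for i, x in enumerate(v)]


def project_binary(v):
    """Projection onto {-a, +a}^n, where a = mean(|v|) minimizes the
    squared distance to v (assumed choice of quantization level)."""
    a = sum(abs(x) for x in v) / len(v)
    return [a if x >= 0 else -a for x in v]


def admm(target, project, w0, mask, rho=1.0, iters=50):
    """One-shot ADMM for: min 0.5*||w - target||^2  s.t.  w in constraint set.
    `mask` holds 1.0 for trainable weights and 0.0 for weights frozen at zero."""
    w = list(w0)
    z = project(w)           # auxiliary variable constrained to the set
    u = [0.0] * len(w)       # scaled dual variable
    for _ in range(iters):
        # W-step: minimize loss + (rho/2)*||w - z + u||^2 (closed form here).
        w = [m * (t + rho * (zi - ui)) / (1.0 + rho)
             for m, t, zi, ui in zip(mask, target, z, u)]
        # Z-step: project w + u onto the constraint set.
        z = project([wi + ui for wi, ui in zip(w, u)])
        # Dual update accumulates the remaining constraint violation.
        u = [ui + wi - zi for ui, wi, zi in zip(u, w, z)]
    # A final hard projection guarantees a feasible solution.
    return project(w)


def progressive_prune(target, schedule, rho=1.0, iters=50):
    """Multi-step pruning: each step starts from the previous result and
    freezes already-pruned weights, shrinking the search space."""
    w = list(target)
    for k in schedule:
        mask = [1.0 if x != 0.0 else 0.0 for x in w]
        w = admm(target, lambda v, k=k: project_topk(v, k), w, mask, rho, iters)
    return w


def pruning_rate(w):
    """Ratio of total weights to remaining nonzero weights."""
    return len(w) / sum(1 for x in w if x != 0.0)


if __name__ == "__main__":
    weights = [3.0, -0.1, 0.5, -2.0, 0.05, 1.0, -0.3, 0.2]
    pruned = progressive_prune(weights, schedule=[6, 4, 2])
    print(pruned, pruning_rate(pruned))
    print(project_binary([0.5, -1.5, 1.0]))
```

Passing `project_binary` instead of the top-k projection to `admm` turns the same loop into a quantization sketch, which mirrors the abstract's point that the ADMM framework generalizes from pruning to quantization by changing only the constraint set.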

URL

https://arxiv.org/abs/1903.09769

PDF

https://arxiv.org/pdf/1903.09769.pdf