Abstract
We present DeepCABAC, a novel context-adaptive binary arithmetic coder for compressing deep neural networks. It quantizes each weight parameter by minimizing a weighted rate-distortion function, which implicitly takes the impact of quantization on the network's accuracy into account. Subsequently, it compresses the quantized values into a bitstream representation with minimal redundancy. We show that DeepCABAC reaches very high compression ratios across a wide range of network architectures and datasets. For instance, we are able to compress the VGG16 ImageNet model by a factor of 63.6 with no loss of accuracy, representing the entire network with merely 8.7 MB.
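The quantization step described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the rate term is approximated by a simple grid-index bit cost, the trade-off parameter `lam` is hypothetical, and the per-weight importance weighting that models accuracy impact is reduced to a plain squared-error distortion.

```python
import numpy as np

def rd_quantize(weights, grid, lam=0.01):
    """Map each weight to the grid point minimizing distortion + lam * rate.

    Sketch only (assumed form): 'rate' is proxied by the distance of the
    grid index from the zero point, since larger indices typically cost
    more bits under entropy coding.
    """
    grid = np.asarray(grid, dtype=float)
    weights = np.asarray(weights, dtype=float)
    # Rate proxy: bit cost grows with the index distance from zero.
    zero_idx = int(np.argmin(np.abs(grid)))
    rate = np.abs(np.arange(len(grid)) - zero_idx) + 1.0
    out = np.empty_like(weights)
    for i, w in enumerate(weights.ravel()):
        cost = (w - grid) ** 2 + lam * rate  # weighted rate-distortion cost
        out.flat[i] = grid[np.argmin(cost)]
    return out

w = np.array([0.03, -0.41, 0.09, 0.52])
q = rd_quantize(w, grid=np.linspace(-0.5, 0.5, 5))
# Small weights snap to zero; larger ones keep the nearest grid point.
```

Raising `lam` biases more weights toward the cheap-to-code zero point, trading distortion for a smaller bitstream; the quantized values would then be entropy-coded, e.g. with a context-adaptive binary arithmetic coder.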
URL
https://arxiv.org/abs/1905.08318