Abstract
With the advancement of diffusion models (DMs) and their substantially increased computational requirements, quantization emerges as a practical solution for obtaining compact and efficient low-bit DMs. However, the highly discrete representation leads to severe accuracy degradation, hindering the quantization of diffusion models to ultra-low bit-widths. In this paper, we propose BinaryDM, a novel accurate quantization-aware training approach that pushes the weights of diffusion models toward the limit of 1-bit. First, we present a Learnable Multi-basis Binarizer (LMB) to recover the representations generated by the binarized DM, preserving the fine-grained details of representations crucial to the DM. Second, Low-rank Representation Mimicking (LRM) is applied to enhance the binarization-aware optimization of the DM, alleviating the ambiguity of optimization direction caused by fine-grained alignment. Moreover, a progressive initialization strategy is applied when training DMs to avoid convergence difficulties. Comprehensive experiments demonstrate that BinaryDM achieves significant accuracy and efficiency gains over SOTA quantization methods for DMs under ultra-low bit-widths. As the first binarization method for diffusion models, BinaryDM achieves impressive 16.0× FLOPs and 27.1× storage savings with 1-bit weights and 4-bit activations, showcasing its substantial advantages and potential for deploying DMs in resource-limited scenarios.
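For intuition, below is a minimal PyTorch sketch of the two ideas the abstract names: a two-basis learnable binarizer with a straight-through estimator, and a representation-mimicking loss computed in a shared low-rank space. The class and function names (MultiBasisBinarizer, low_rank_mimicking_loss), the residual two-basis form, and the fixed random projection proj are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MultiBasisBinarizer(nn.Module):
    """Sketch of a learnable two-basis weight binarizer (assumed form):
    w_hat = s1 * sign(w) + s2 * sign(w - s1 * sign(w)),
    where s1 and s2 are learnable scalar scales."""

    def __init__(self, weight: torch.Tensor):
        super().__init__()
        # Initialize scales from mean absolute values, a common
        # binarization heuristic (first basis, then its residual).
        s1 = weight.abs().mean()
        residual = weight - s1 * torch.sign(weight)
        self.scale1 = nn.Parameter(s1.clone())
        self.scale2 = nn.Parameter(residual.abs().mean().clone())

    @staticmethod
    def _sign_ste(x: torch.Tensor) -> torch.Tensor:
        # Straight-through estimator: sign() in the forward pass,
        # identity gradient in the backward pass.
        return (torch.sign(x) - x).detach() + x

    def forward(self, weight: torch.Tensor) -> torch.Tensor:
        b1 = self._sign_ste(weight)
        residual = weight - self.scale1 * b1
        b2 = self._sign_ste(residual)
        return self.scale1 * b1 + self.scale2 * b2


def low_rank_mimicking_loss(student_feat, teacher_feat, proj):
    """Sketch of low-rank representation mimicking: project the
    binarized (student) and full-precision (teacher) features into a
    shared low-rank space, then align them with an MSE loss."""
    n, c, h, w = student_feat.shape
    s = student_feat.permute(0, 2, 3, 1).reshape(-1, c) @ proj
    t = teacher_feat.permute(0, 2, 3, 1).reshape(-1, c) @ proj
    return F.mse_loss(s, t.detach())


# Example usage: binarize a conv weight and align features in rank-8 space.
conv = nn.Conv2d(64, 64, 3, padding=1)
binarizer = MultiBasisBinarizer(conv.weight.data)
w_bin = binarizer(conv.weight)          # substitute for conv.weight in forward
proj = torch.randn(64, 8) / 64 ** 0.5   # fixed random low-rank projection
```

The straight-through estimator keeps the weights trainable despite the non-differentiable sign(), and projecting both feature maps through the same fixed matrix reduces the dimensionality of the alignment target, which is one plausible way to realize the "low-rank" mimicking the abstract describes.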
URL
https://arxiv.org/abs/2404.05662