Abstract
Diffusion models (DMs) can generate remarkably high-quality samples by iteratively denoising a random vector, a process that corresponds to moving along the probability flow ordinary differential equation (PF ODE). DMs can also invert an input image to noise by moving backward along the PF ODE, a key operation for downstream tasks such as interpolation and image editing. However, the iterative nature of this process limits its speed and hinders its broader application. Recently, Consistency Models (CMs) have emerged to address this challenge by approximating the integral of the PF ODE, thereby removing the need to iterate. Yet the absence of an explicit ODE solver complicates the inversion process. To resolve this, we introduce the Bidirectional Consistency Model (BCM), which learns a single neural network that enables both forward and backward traversal along the PF ODE, unifying generation and inversion within one framework. Notably, our method enables one-step generation and inversion while also allowing additional steps to be used to improve generation quality or reduce reconstruction error. Furthermore, by leveraging our model's bidirectional consistency, we introduce a sampling strategy that can improve FID while preserving the content of the generated image. We further showcase our model's capabilities in several downstream tasks, such as interpolation and inpainting, and demonstrate potential applications, including blind restoration of compressed images and defending against black-box adversarial attacks.
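To make the idea of traversing the PF ODE in both directions concrete, the following is a minimal toy sketch and not the paper's implementation. It assumes one-dimensional Gaussian data with standard deviation `DATA_STD` and the noise schedule sigma(t) = t, under which the PF ODE has a closed-form solution. That closed-form map (`exact_map`, a hypothetical name) stands in for the learned network: it jumps between any two times in one step, in either direction, whereas `euler_solve` (also hypothetical) has to iterate. All names and constants here are illustrative assumptions.

```python
import math

# Assumption: toy 1-D data distributed as N(0, DATA_STD^2), noise schedule sigma(t) = t.
# The noisy marginal at time t is then N(0, DATA_STD^2 + t^2), with score
# -x / (DATA_STD^2 + t^2), and the PF ODE is dx/dt = -t * score(x, t).
DATA_STD = 0.5


def pf_ode_drift(x, t):
    """Right-hand side dx/dt of the PF ODE for the toy Gaussian case."""
    return t * x / (DATA_STD**2 + t**2)


def euler_solve(x, t_start, t_end, n_steps):
    """Iterative traversal: integrate the PF ODE from t_start to t_end with Euler steps."""
    dt = (t_end - t_start) / n_steps
    t = t_start
    for _ in range(n_steps):
        x = x + dt * pf_ode_drift(x, t)
        t = t + dt
    return x


def exact_map(x, t, u):
    """Closed-form solution of the toy PF ODE: maps a point at time t to time u.

    This plays the role a bidirectional consistency-style network would learn;
    it works for u < t (denoising) and for u > t (inversion).
    """
    return x * math.sqrt((DATA_STD**2 + u**2) / (DATA_STD**2 + t**2))


# Assumed time range for illustration: T is the noise end, EPS the data end.
T, EPS = 80.0, 0.002
noise = 40.0

sample_one_step = exact_map(noise, T, EPS)            # one-step generation
sample_iterative = euler_solve(noise, T, EPS, 4000)   # many-step generation
recovered_noise = exact_map(sample_one_step, EPS, T)  # one-step inversion

print(sample_one_step, sample_iterative, recovered_noise)
```

In this toy setting the one-step map and the iterative solver agree up to discretization error, and inversion recovers the starting noise. For real image data no closed form exists, which is why such a map has to be learned.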
URL
https://arxiv.org/abs/2403.18035