On the Importance of Noise Scheduling for Diffusion Models

Abstract

We empirically study the effect of noise scheduling strategies for denoising diffusion generative models. There are three findings: (1) noise scheduling is crucial for performance, and the optimal schedule depends on the task (e.g., image size); (2) as image size increases, the optimal noise schedule shifts towards a noisier one (due to increased redundancy in pixels); and (3) simply scaling the input data by a factor of $b$ while keeping the noise schedule function fixed (equivalent to shifting the logSNR by $\log b$) is a good strategy across image sizes. This simple recipe, when combined with the recently proposed Recurrent Interface Network (RIN), yields state-of-the-art pixel-based diffusion models for high-resolution images on ImageNet, enabling single-stage, end-to-end generation of diverse and high-fidelity images at 1024$\times$1024 resolution for the first time (without upsampling/cascades).
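
Finding (3) can be illustrated with a minimal sketch. It is not the authors' implementation: the cosine schedule `cosine_gamma`, the forward process `x_t = sqrt(gamma) * b * x_0 + sqrt(1 - gamma) * eps`, and the function names are all assumptions chosen for illustration. The sketch also shows that the size of the logSNR shift depends on the convention: with SNR taken as an amplitude ratio the shift is $\log b$, while with a power (variance) ratio it is $2 \log b$.

```python
import numpy as np

def cosine_gamma(t):
    """Assumed cosine schedule: gamma(t) = cos^2(pi * t / 2), with t in [0, 1]."""
    return np.cos(0.5 * np.pi * t) ** 2

def noise_input(x0, t, b=1.0, rng=None):
    """Noise x0 at time t, scaling the input by b and keeping gamma(t) fixed."""
    rng = np.random.default_rng() if rng is None else rng
    gamma = cosine_gamma(t)
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(gamma) * b * x0 + np.sqrt(1.0 - gamma) * eps

def log_snr(t, b=1.0, power=False):
    """logSNR at time t (0 < t < 1) for unit-variance data scaled by b.

    power=False: amplitude ratio, b * sqrt(gamma) / sqrt(1 - gamma).
    power=True:  variance ratio,  b^2 * gamma / (1 - gamma).
    """
    gamma = cosine_gamma(t)
    log_amplitude = 0.5 * np.log(gamma / (1.0 - gamma)) + np.log(b)
    return 2.0 * log_amplitude if power else log_amplitude

# Shift caused by scaling the input with b = 0.5 at t = 0.5.
shift_amplitude = log_snr(0.5, b=0.5) - log_snr(0.5, b=1.0)
shift_power = log_snr(0.5, b=0.5, power=True) - log_snr(0.5, b=1.0, power=True)
```

Choosing `b < 1` lowers the logSNR at every `t`, so the same schedule function behaves as a noisier schedule, which is the direction finding (2) says larger images need.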

URL

https://arxiv.org/abs/2301.10972

PDF

https://arxiv.org/pdf/2301.10972.pdf