Abstract
Denoising Probabilistic Models (DPMs) are an emerging family of generative models that excel at producing diverse, high-quality images. However, most current training methods for DPMs neglect the correlation between timesteps, which limits the models' image-generation performance. Notably, we theoretically show that this issue can be caused by the cumulative estimation gap between the predicted and the actual trajectory. To minimize this gap, we propose a novel \textit{sequence-aware} loss that reduces the estimation gap and thereby improves sampling quality. Furthermore, we theoretically show that the proposed loss is a tighter upper bound of the estimation loss than the conventional DPM loss. Experimental results on several benchmark datasets, including CIFAR10, CelebA, and CelebA-HQ, consistently show that our method remarkably improves image generation quality, as measured by FID and Inception Score, over several DPM baselines. Our code and pre-trained checkpoints are available at \url{this https URL}.
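The contrast the abstract draws can be sketched in code: the conventional DPM objective averages a per-timestep noise-prediction error with each timestep treated independently, whereas a sequence-aware loss couples timesteps by weighting errors along the sampling trajectory. The exact loss from the paper is not given in the abstract, so the weighting scheme below (`weights` emphasizing error accumulation across steps) is purely a hypothetical illustration, not the authors' method.

```python
import numpy as np

def dpm_loss(eps_pred, eps_true):
    # Conventional DPM objective: mean squared error between predicted
    # and true noise, with every timestep treated independently.
    return np.mean((eps_pred - eps_true) ** 2)

def sequence_aware_loss(eps_pred_seq, eps_true_seq, weights):
    # Hypothetical sequence-aware variant (illustration only): compute a
    # per-timestep MSE along axis 0 (the trajectory), then combine the
    # timesteps with trajectory-dependent weights so that errors which
    # accumulate along the predicted trajectory can be penalized more.
    reduce_axes = tuple(range(1, eps_pred_seq.ndim))
    per_step = np.mean((eps_pred_seq - eps_true_seq) ** 2, axis=reduce_axes)
    return np.sum(weights * per_step) / np.sum(weights)

# Usage sketch: T timesteps of d-dimensional noise predictions.
rng = np.random.default_rng(0)
T, d = 5, 4
pred = rng.normal(size=(T, d))
true = rng.normal(size=(T, d))
uniform = np.ones(T)            # uniform weights recover the standard loss
late_heavy = np.linspace(1, 2, T)  # hypothetical: weight later steps more
```

With uniform weights the sequence-aware form reduces to the conventional per-timestep average, which makes the two directly comparable; the claimed tighter upper bound would come from a non-uniform weighting, whose derivation is in the paper itself.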
URL
https://arxiv.org/abs/2312.12431