
Efficient Conditional Diffusion Model with Probability Flow Sampling for Image Super-resolution

2024-04-16 16:08:59
Yutao Yuan, Chun Yuan

Abstract

Image super-resolution is a fundamentally ill-posed problem because multiple valid high-resolution images exist for one low-resolution image. Super-resolution methods based on diffusion probabilistic models can deal with the ill-posed nature by learning the distribution of high-resolution images conditioned on low-resolution images, avoiding the problem of blurry images in PSNR-oriented methods. However, existing diffusion-based super-resolution methods have high time consumption with the use of iterative sampling, while the quality and consistency of generated images are less than ideal due to problems like color shifting. In this paper, we propose Efficient Conditional Diffusion Model with Probability Flow Sampling (ECDP) for image super-resolution. To reduce the time consumption, we design a continuous-time conditional diffusion model for image super-resolution, which enables the use of probability flow sampling for efficient generation. Additionally, to improve the consistency of generated images, we propose a hybrid parametrization for the denoiser network, which interpolates between the data-predicting parametrization and the noise-predicting parametrization for different noise scales. Moreover, we design an image quality loss as a complement to the score matching loss of diffusion models, further improving the consistency and quality of super-resolution. Extensive experiments on DIV2K, ImageNet, and CelebA demonstrate that our method achieves higher super-resolution quality than existing diffusion-based image super-resolution methods while having lower time consumption. Our code is available at this https URL.
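As a rough illustration of what probability flow sampling means, the sketch below integrates the probability flow ODE for a one-dimensional toy problem. It is not the ECDP model. The Gaussian "data" distribution, the variance-exploding noise parametrization, the geometric noise schedule, the Euler integrator, and every name and constant are assumptions made for this example, and the closed-form score stands in for a trained denoiser network. Conditioning on a low-resolution image is omitted entirely.

```python
import math
import random
import statistics

# Toy setup (all values are illustrative assumptions, not from the paper):
# the "data" distribution is a 1-D Gaussian N(MU, S0^2), so the score of the
# noised distribution p_sigma = N(MU, S0^2 + sigma^2) is known in closed form.
MU, S0 = 2.0, 0.5
SIGMA_MAX, SIGMA_MIN = 20.0, 1e-3


def score(x, sigma):
    """Exact d/dx log p_sigma(x); stands in for a trained score network."""
    return -(x - MU) / (S0**2 + sigma**2)


def probability_flow_sample(x, num_steps=200):
    """Euler-integrate dx/dsigma = -sigma * score(x, sigma) from SIGMA_MAX to SIGMA_MIN."""
    # Geometric noise schedule: sigma shrinks by a constant factor per step.
    ratio = (SIGMA_MIN / SIGMA_MAX) ** (1.0 / num_steps)
    sigma = SIGMA_MAX
    for _ in range(num_steps):
        sigma_next = sigma * ratio
        drift = -sigma * score(x, sigma)       # ODE right-hand side
        x = x + (sigma_next - sigma) * drift   # deterministic step, no noise added
        sigma = sigma_next
    return x


if __name__ == "__main__":
    rng = random.Random(0)
    # Start from the usual approximate prior N(0, SIGMA_MAX^2); this introduces
    # a small bias in the mean because the prior ignores MU.
    samples = [probability_flow_sample(rng.gauss(0.0, SIGMA_MAX)) for _ in range(2000)]
    print("sample mean:", statistics.mean(samples))
    print("sample std: ", statistics.stdev(samples))
```

Because the ODE adds no noise during integration, the output is a deterministic function of the initial draw, and the number of steps can be traded against accuracy. In this toy case the sample mean and standard deviation should land close to MU and S0.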

Abstract (translated)

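Image super-resolution is a fundamentally ill-posed problem, because multiple high-resolution images exist for a single low-resolution image. Super-resolution methods based on diffusion probabilistic models address this ill-posedness by learning the probability distribution of high-resolution images conditioned on low-resolution images, which avoids the blurry images produced by PSNR-oriented methods. However, existing diffusion-based super-resolution methods are slow because of iterative sampling, and the quality and consistency of the generated images fall short of ideal because of problems such as color shifting. In this paper, we propose the Efficient Conditional Diffusion Model with Probability Flow Sampling (ECDP) for image super-resolution. To reduce time consumption, we design a continuous-time conditional diffusion model, which allows probability flow sampling to be used for efficient image generation. In addition, to improve the consistency of generated images, we propose a hybrid parametrization that interpolates between the data-predicting parametrization and the noise-predicting parametrization. We also design an image quality loss as a complement to the score matching loss of diffusion models, further improving the consistency and quality of super-resolution. Extensive experiments on DIV2K, ImageNet, and CelebA show that our method achieves higher super-resolution quality than existing diffusion-based image super-resolution methods while consuming less time. Our code is available at: https://www.kazuhiko.me/ECDP-SUPER-RESOLUTION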

URL

https://arxiv.org/abs/2404.10688

PDF

https://arxiv.org/pdf/2404.10688.pdf

