DeeDSR: Towards Real-World Image Super-Resolution via Degradation-Aware Stable Diffusion

2024-03-31 12:07:04
Chunyang Bi, Xin Luo, Sheng Shen, Mengxi Zhang, Huanjing Yue, Jingyu Yang

Abstract

Diffusion models, known for their powerful generative capabilities, play a crucial role in addressing real-world super-resolution challenges. However, these models often focus on improving local textures while neglecting the impacts of global degradation, which can significantly reduce semantic fidelity and lead to inaccurate reconstructions and suboptimal super-resolution performance. To address this issue, we introduce a novel two-stage, degradation-aware framework that enhances the diffusion model's ability to recognize content and degradation in low-resolution images. In the first stage, we employ unsupervised contrastive learning to obtain representations of image degradations. In the second stage, we integrate a degradation-aware module into a simplified ControlNet, enabling flexible adaptation to various degradations based on the learned representations. Furthermore, we decompose the degradation-aware features into a global-semantics branch and a local-details branch, which are then injected into the diffusion denoising module to modulate the target generation. Our method effectively recovers semantically precise and photorealistic details, particularly under significant degradation conditions, demonstrating state-of-the-art performance across various benchmarks. Code will be released at this https URL.
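
The first stage described above, unsupervised contrastive learning of degradation representations, can be illustrated with a minimal sketch. The abstract does not say which contrastive objective the authors use, so this example assumes the widely used InfoNCE loss. The function names, the temperature value, and the toy embeddings are all hypothetical and are not taken from the paper or its code. The idea is that two patches sharing the same degradation form a positive pair, while patches with different degradations act as negatives.

```python
import math


def l2_normalize(vector):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def info_nce_loss(query, positive, negatives, temperature=0.07):
    """InfoNCE loss for one query embedding (assumed objective, not the paper's).

    query, positive: embeddings of two patches with the same degradation.
    negatives: embeddings of patches with other degradations.
    """
    q = l2_normalize(query)
    # Cosine similarities scaled by the temperature; the positive logit is first.
    logits = [dot(q, l2_normalize(positive)) / temperature]
    logits += [dot(q, l2_normalize(n)) / temperature for n in negatives]
    # Numerically stable log-sum-exp over all logits.
    largest = max(logits)
    log_denominator = largest + math.log(sum(math.exp(l - largest) for l in logits))
    # Equivalent to -log(softmax(logits)[0]).
    return log_denominator - logits[0]


# Toy 2-D embeddings, purely illustrative.
query = [1.0, 0.0]
same_degradation = [0.9, 0.1]
other_degradations = [[0.0, 1.0], [-1.0, 0.0]]

loss_matched = info_nce_loss(query, same_degradation, other_degradations)
loss_mismatched = info_nce_loss(query, other_degradations[0],
                                [same_degradation, other_degradations[1]])
```

In this toy setup the loss is close to zero when the positive is the patch with the matching degradation, and much larger when a mismatched patch is treated as the positive, which is the signal that pushes an encoder to group embeddings by degradation type.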

URL

https://arxiv.org/abs/2404.00661

PDF

https://arxiv.org/pdf/2404.00661.pdf

