
Burst Super-Resolution with Diffusion Models for Improving Perceptual Quality

2024-03-28 13:58:05
Kyotaro Tokoro, Kazutoshi Akita, Norimichi Ukita


Burst low-resolution (LR) images improve super-resolution (SR) image quality compared with a single LR image, but prior SR networks that accept burst LR images are trained in a deterministic manner, which is known to produce blurry SR images. In addition, the burst LR images are difficult to align perfectly, which makes the SR image even blurrier. Since such blurry images are perceptually degraded, we aim to reconstruct sharp, high-fidelity boundaries. Such high-fidelity images can be reconstructed by diffusion models. However, prior SR methods using diffusion models are not properly optimized for the burst SR task. Specifically, a reverse process that starts from a random sample is not optimized for image enhancement and restoration tasks, including burst SR. In our proposed method, by contrast, burst LR features are used to reconstruct an initial burst SR image that is fed into an intermediate step of the diffusion model. This reverse process from the intermediate step 1) skips the diffusion steps that reconstruct the global structure of the image and 2) focuses on the steps that refine detailed textures. Our experimental results demonstrate that our method improves the scores of perceptual quality metrics. Code: this https URL
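The idea of starting the reverse process at an intermediate step can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not say how the initial SR image is injected, so this sketch assumes standard DDPM forward noising up to a chosen step `t_start`, followed by the usual ancestral reverse updates. The linear beta schedule, the names `make_schedule`, `noise_to_step`, `reverse_from_intermediate`, and the placeholder `dummy_predict_noise` are all hypothetical, and flat Python lists stand in for image tensors.

```python
import math
import random


def make_schedule(num_steps=1000, beta_start=1e-4, beta_end=2e-2):
    """Linear beta schedule (an assumption) and its cumulative alpha products."""
    betas = [beta_start + (beta_end - beta_start) * i / (num_steps - 1)
             for i in range(num_steps)]
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return betas, alpha_bars


def noise_to_step(x0, t, alpha_bars, rng):
    """Forward-diffuse x0 to step t: sqrt(a_bar)*x0 + sqrt(1 - a_bar)*eps."""
    signal = math.sqrt(alpha_bars[t])
    sigma = math.sqrt(1.0 - alpha_bars[t])
    return [signal * v + sigma * rng.gauss(0.0, 1.0) for v in x0]


def reverse_from_intermediate(initial_sr, t_start, predict_noise, schedule, rng):
    """Run DDPM reverse steps t_start..0 instead of starting from pure noise."""
    betas, alpha_bars = schedule
    x = noise_to_step(initial_sr, t_start, alpha_bars, rng)
    steps_run = 0
    for t in range(t_start, -1, -1):
        eps_hat = predict_noise(x, t)
        coef = betas[t] / math.sqrt(1.0 - alpha_bars[t])
        scale = 1.0 / math.sqrt(1.0 - betas[t])
        sigma = math.sqrt(betas[t]) if t > 0 else 0.0  # no noise at the last step
        x = [scale * (v - coef * e) + sigma * rng.gauss(0.0, 1.0)
             for v, e in zip(x, eps_hat)]
        steps_run += 1
    return x, steps_run


def dummy_predict_noise(x, t):
    """Placeholder for a trained noise-prediction network (hypothetical)."""
    return [0.0 for _ in x]


rng = random.Random(0)
schedule = make_schedule()
initial_sr = [rng.random() for _ in range(64)]  # stand-in for an initial SR image
refined, steps_run = reverse_from_intermediate(
    initial_sr, 200, dummy_predict_noise, schedule, rng)
```

With `t_start = 200` on a 1000-step schedule, only 201 reverse updates are executed, which reflects the claim that the early, structure-forming steps are skipped and the remaining steps are spent on texture refinement. The value of `t_start` here is arbitrary; the abstract does not state which step the method uses.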



