Paper Reading AI Learner

Learning Dual-Level Deformable Implicit Representation for Real-World Scale Arbitrary Super-Resolution

2024-03-16 13:44:42
Zhiheng Li, Muheng Li, Jixuan Fan, Lei Chen, Yansong Tang, Jie Zhou, Jiwen Lu

Abstract

Scale arbitrary super-resolution based on implicit image function gains increasing popularity since it can better represent the visual world in a continuous manner. However, existing scale arbitrary works are trained and evaluated on simulated datasets, where low-resolution images are generated from their ground truths by the simplest bicubic downsampling. These models exhibit limited generalization to real-world scenarios due to the greater complexity of real-world degradations. To address this issue, we build a RealArbiSR dataset, a new real-world super-resolution benchmark with both integer and non-integer scaling factors for the training and evaluation of real-world scale arbitrary super-resolution. Moreover, we propose a Dual-level Deformable Implicit Representation (DDIR) to solve real-world scale arbitrary super-resolution. Specifically, we design the appearance embedding and deformation field to handle both image-level and pixel-level deformations caused by real-world degradations. The appearance embedding models the characteristics of low-resolution inputs to deal with photometric variations at different scales, and the pixel-based deformation field learns RGB differences which result from the deviations between the real-world and simulated degradations at arbitrary coordinates. Extensive experiments show our trained model achieves state-of-the-art performance on the RealArbiSR and RealSR benchmarks for real-world scale arbitrary super-resolution. Our dataset as well as source code will be publicly available.

Abstract (translated)

基于隐式图像函数增益的任意超分辨率已经在视觉世界中以连续的方式更好地表示视觉世界而变得越来越受欢迎。然而,现有的基于模拟数据集训练和评估的超分辨率模型在真实世界场景中的泛化能力有限,因为真实世界退化更加复杂。为解决这个问题,我们构建了RealArbiSR数据集,一个新的真实世界超分辨率基准,具有整数和非整数缩放因子,用于真实世界规模任意超分辨率。此外,我们提出了一个双层可塑隐式表示(DDIR)来解决真实世界规模任意超分辨率。具体来说,我们设计了一个能够处理真实世界退化引起的图像级别和像素级别变形的外观嵌入和变形场。外观嵌入模型对低分辨率输入的特性进行了建模,以处理不同尺度下的光度变化;像素级别的变形场学习真实世界和模拟退化之间的任意坐标差值产生的RGB差异。大量实验证明,我们训练的模型在真实世界规模任意超分辨率基准上取得了最先进的性能。我们的数据集以及源代码将公开发布。

URL

https://arxiv.org/abs/2403.10925

PDF

https://arxiv.org/pdf/2403.10925.pdf


Tags
3D Action Action_Localization Action_Recognition Activity Adversarial Agent Attention Autonomous Bert Boundary_Detection Caption Chat Classification CNN Compressive_Sensing Contour Contrastive_Learning Deep_Learning Denoising Detection Dialog Diffusion Drone Dynamic_Memory_Network Edge_Detection Embedding Embodied Emotion Enhancement Face Face_Detection Face_Recognition Facial_Landmark Few-Shot Gait_Recognition GAN Gaze_Estimation Gesture Gradient_Descent Handwriting Human_Parsing Image_Caption Image_Classification Image_Compression Image_Enhancement Image_Generation Image_Matting Image_Retrieval Inference Inpainting Intelligent_Chip Knowledge Knowledge_Graph Language_Model Matching Medical Memory_Networks Multi_Modal Multi_Task NAS NMT Object_Detection Object_Tracking OCR Ontology Optical_Character Optical_Flow Optimization Person_Re-identification Point_Cloud Portrait_Generation Pose Pose_Estimation Prediction QA Quantitative Quantitative_Finance Quantization Re-identification Recognition Recommendation Reconstruction Regularization Reinforcement_Learning Relation Relation_Extraction Represenation Represenation_Learning Restoration Review RNN Salient Scene_Classification Scene_Generation Scene_Parsing Scene_Text Segmentation Self-Supervised Semantic_Instance_Segmentation Semantic_Segmentation Semi_Global Semi_Supervised Sence_graph Sentiment Sentiment_Classification Sketch SLAM Sparse Speech Speech_Recognition Style_Transfer Summarization Super_Resolution Surveillance Survey Text_Classification Text_Generation Tracking Transfer_Learning Transformer Unsupervised Video_Caption Video_Classification Video_Indexing Video_Prediction Video_Retrieval Visual_Relation VQA Weakly_Supervised Zero-Shot