
Recurrent Structure Attention Guidance for Depth Super-Resolution

2023-01-31 05:18:34
Jiayi Yuan, Haobo Jiang, Xiang Li, Jianjun Qian, Jun Li, Jian Yang

Abstract

Image guidance is an effective strategy for depth super-resolution. Generally, most existing methods employ hand-crafted operators to decompose the high-frequency (HF) and low-frequency (LF) ingredients from low-resolution depth maps and guide the HF ingredients by directly concatenating them with image features. However, the hand-designed operators usually cause inferior HF maps (e.g., distorted or structurally missing) due to the diverse appearance of complex depth maps. Moreover, the direct concatenation often results in weak guidance because not all image features have a positive effect on the HF maps. In this paper, we develop a recurrent structure attention guided (RSAG) framework, consisting of two important parts. First, we introduce a deep contrastive network with multi-scale filters for adaptive frequency-domain separation, which adopts contrastive networks from large filters to small ones to calculate the pixel contrasts for adaptive high-quality HF predictions. Second, instead of the coarse concatenation guidance, we propose a recurrent structure attention block, which iteratively utilizes the latest depth estimation and the image features to jointly select clear patterns and boundaries, aiming at providing refined guidance for accurate depth recovery. In addition, we fuse the features of HF maps to enhance the edge structures in the decomposed LF maps. Extensive experiments show that our approach obtains superior performance compared with state-of-the-art depth super-resolution methods.
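
To make the baseline the abstract criticizes more concrete, here is a minimal sketch of a hand-crafted frequency decomposition: a fixed box filter gives the LF map, and the residual gives the HF map. This is not the RSAG method and not code from the paper. The function names `box_blur` and `decompose`, the box filter itself, and the `radius` parameter are assumptions chosen for illustration only.

```python
def box_blur(depth, radius=1):
    """Fixed low-pass filter: mean over a (2*radius+1)^2 window.

    The window is truncated at the image borders (no padding).
    `depth` is a list of equal-length rows of floats.
    """
    h, w = len(depth), len(depth[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total, count = 0.0, 0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:
                        total += depth[yy][xx]
                        count += 1
            out[y][x] = total / count
    return out


def decompose(depth, radius=1):
    """Split a depth map into (HF, LF), where HF = depth - LF."""
    lf = box_blur(depth, radius)
    hf = [[d - l for d, l in zip(d_row, l_row)]
          for d_row, l_row in zip(depth, lf)]
    return hf, lf


# Toy depth map: a flat near surface (1.0) next to a flat far surface (5.0).
depth = [[1.0, 1.0, 1.0, 5.0, 5.0, 5.0] for _ in range(4)]
hf, lf = decompose(depth)

# HF is zero inside the flat regions and non-zero only around the depth edge.
for row in hf:
    print([round(v, 2) for v in row])
```

Because the filter size is fixed in advance, the HF map depends entirely on how well that one size matches the scene. That is the limitation the abstract attributes to hand-designed operators, and which the paper addresses with learned multi-scale contrastive filters.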


URL

https://arxiv.org/abs/2301.13419

PDF

https://arxiv.org/pdf/2301.13419.pdf

