Coarse-Fine Spectral-Aware Deformable Convolution For Hyperspectral Image Reconstruction

2024-06-18 15:15:12
Jincheng Yang, Lishun Wang, Miao Cao, Huan Wang, Yinping Zhao, Xin Yuan

Abstract

We study the inverse problem of Coded Aperture Snapshot Spectral Imaging (CASSI), which compresses a spatial-spectral data cube into a snapshot 2D measurement and relies on algorithms to reconstruct the 3D hyperspectral image (HSI). Current methods based on Convolutional Neural Networks (CNNs) struggle to capture long-range dependencies and non-local similarities, while the recently popular Transformer-based methods are hard to deploy on downstream tasks because of the high computational cost of self-attention. In this paper, we propose the Coarse-Fine Spectral-Aware Deformable Convolution Network (CFSDCN), which applies deformable convolutional networks (DCN) to this task for the first time. Considering the sparsity of HSI, we design a deformable convolution module that exploits its deformability to capture long-range dependencies and non-local similarities. In addition, we propose a new spectral information interaction module that considers both coarse-grained and fine-grained spectral similarities. Extensive experiments demonstrate that our CFSDCN significantly outperforms previous state-of-the-art (SOTA) methods on both simulated and real HSI datasets.
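
The abstract does not spell out the CASSI measurement model, so the following is only a minimal sketch of the commonly used single-disperser simulation, not the authors' code. It assumes each spectral band is modulated by one shared binary mask, shifted along the width axis by a fixed number of pixels per band, and summed onto the 2D sensor. The function name `cassi_forward`, the `step` value, and the toy sizes are all illustrative assumptions.

```python
import numpy as np

def cassi_forward(cube, mask, step=2):
    """Simulate a single-disperser CASSI snapshot (illustrative sketch).

    cube: (H, W, B) hyperspectral cube
    mask: (H, W) coded aperture shared by all bands
    step: dispersion shift in pixels per band (assumed value)
    Returns a 2D measurement of shape (H, W + step * (B - 1)).
    """
    H, W, B = cube.shape
    measurement = np.zeros((H, W + step * (B - 1)), dtype=cube.dtype)
    for b in range(B):
        # Modulate band b with the mask, then place it shifted by b * step
        # columns so that neighbouring bands overlap on the sensor.
        measurement[:, b * step : b * step + W] += mask * cube[:, :, b]
    return measurement

# Toy example: a 4x4 cube with 3 bands collapses into one 4x8 snapshot.
rng = np.random.default_rng(0)
cube = rng.random((4, 4, 3))
mask = (rng.random((4, 4)) > 0.5).astype(cube.dtype)
y = cassi_forward(cube, mask)
print(y.shape)  # (4, 8)
```

Reconstruction is the inverse of this mapping: recovering the `(H, W, B)` cube from the single 2D array `y` and the known mask, which is the task the proposed network addresses.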

URL

https://arxiv.org/abs/2406.12703

PDF

https://arxiv.org/pdf/2406.12703.pdf

