Paper Reading AI Learner

HSIDMamba: Exploring Bidirectional State-Space Models for Hyperspectral Denoising

2024-04-15 11:59:19
Yang Liu, Jiahua Xiao, Yu Guo, Peilin Jiang, Haiwei Yang, Fei Wang

Abstract

Effectively discerning spatial-spectral dependencies is crucial for HSI denoising, but prevailing methods based on convolution or transformers still face limitations in computational efficiency. Recently, the emerging Selective State Space Model (Mamba) has attracted attention for its nearly linear computational complexity in processing natural language sequences, which inspired us to explore its potential for handling long spectral sequences. In this paper, we propose HSIDMamba (HSDM), tailored to exploit this linear complexity for effectively capturing spatial-spectral dependencies in HSI denoising. In particular, HSDM comprises multiple Hyperspectral Continuous Scan Blocks that incorporate a Bidirectional Continuous Scanning Mechanism (BCSM), scale residual connections, and spectral attention to enhance the capture of long-range and local spatial-spectral information. BCSM strengthens spatial-spectral interactions by linking forward and backward scans and aggregating information from eight scan directions through the SSM, significantly enhancing the perceptual capability of HSDM and improving its denoising performance. Extensive evaluations on HSI denoising benchmarks validate the superiority of HSDM, which achieves state-of-the-art results while surpassing the efficiency of the latest transformer architectures by 30%.
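The abstract only sketches the architecture, so the toy PyTorch snippet below illustrates the general idea of a bidirectional continuous-scanning block: the HSI cube is flattened into 1-D sequences along several spatial orderings, each sequence is processed in both directions by a linear-complexity recurrent operator, and the results are fused with spectral attention and a residual connection. This is a minimal sketch under stated assumptions, not the authors' released code: the class names (`BidirectionalScanBlock`, `SimpleLinearRecurrence`, `SpectralAttention`) are illustrative placeholders, a simple gated recurrence stands in for the selective SSM, and only four of the eight scan directions are shown.

```python
import torch
import torch.nn as nn


class SimpleLinearRecurrence(nn.Module):
    """Stand-in for a selective state-space scan: a gated recurrence
    h_t = a_t * h_{t-1} + b_t, computed token by token in O(L) time."""

    def __init__(self, dim: int):
        super().__init__()
        self.gate = nn.Linear(dim, dim)
        self.inp = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (B, L, C)
        a = torch.sigmoid(self.gate(x))      # per-token decay/selection gate
        b = self.inp(x)                      # per-token input projection
        h = torch.zeros_like(x[:, 0])        # hidden state, shape (B, C)
        outs = []
        for t in range(x.shape[1]):          # linear in sequence length
            h = a[:, t] * h + b[:, t]
            outs.append(h)
        return torch.stack(outs, dim=1)


class SpectralAttention(nn.Module):
    """Squeeze-and-excitation style reweighting of spectral bands."""

    def __init__(self, dim: int, reduction: int = 4):
        super().__init__()
        hidden = max(dim // reduction, 1)
        self.mlp = nn.Sequential(
            nn.Linear(dim, hidden), nn.GELU(),
            nn.Linear(hidden, dim), nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (B, C, H, W)
        w = self.mlp(x.mean(dim=(2, 3)))     # one weight per spectral band
        return x * w[:, :, None, None]


class BidirectionalScanBlock(nn.Module):
    """Scan the flattened HSI cube forward and backward along row-major and
    column-major orderings (4 of the 8 directions mentioned in the abstract),
    realign the results pixel-wise, and fuse them with spectral attention
    plus a residual connection."""

    def __init__(self, dim: int):
        super().__init__()
        self.scan = SimpleLinearRecurrence(dim)   # shared across directions for brevity
        self.fuse = nn.Conv2d(4 * dim, dim, kernel_size=1)
        self.spec_attn = SpectralAttention(dim)

    def _bidirectional(self, seq: torch.Tensor):
        fwd = self.scan(seq)
        bwd = torch.flip(self.scan(torch.flip(seq, dims=[1])), dims=[1])
        return fwd, bwd

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (B, C, H, W)
        b, c, h, w = x.shape
        row_seq = x.flatten(2).transpose(1, 2)                  # (B, H*W, C), row-major
        col_seq = x.transpose(2, 3).flatten(2).transpose(1, 2)  # (B, W*H, C), column-major
        row_fwd, row_bwd = self._bidirectional(row_seq)
        col_fwd, col_bwd = self._bidirectional(col_seq)

        def row_map(s):  # (B, H*W, C) -> (B, C, H, W)
            return s.transpose(1, 2).reshape(b, c, h, w)

        def col_map(s):  # (B, W*H, C) -> (B, C, H, W), realigned to row-major pixels
            return s.transpose(1, 2).reshape(b, c, w, h).transpose(2, 3)

        fused = self.fuse(torch.cat(
            [row_map(row_fwd), row_map(row_bwd), col_map(col_fwd), col_map(col_bwd)], dim=1))
        return x + self.spec_attn(fused)     # residual + spectral attention


if __name__ == "__main__":
    cube = torch.randn(1, 31, 32, 32)        # toy HSI: 31 bands, 32x32 pixels
    print(BidirectionalScanBlock(dim=31)(cube).shape)  # torch.Size([1, 31, 32, 32])
```

The sequential loop inside `SimpleLinearRecurrence` is only meant to make the linear-time recurrence explicit; a real selective SSM implementation would replace it with a parallel or hardware-aware scan.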

URL

https://arxiv.org/abs/2404.09697

PDF

https://arxiv.org/pdf/2404.09697.pdf

