Spectral-Spatial Mamba for Hyperspectral Image Classification

2024-04-29 03:36:05
Lingbo Huang, Yushi Chen, Xin He

Abstract

Recently, deep learning models have achieved excellent performance in hyperspectral image (HSI) classification. Among the many deep models, the Transformer has gradually attracted interest for its ability to model long-range dependencies of spatial-spectral features in HSI. However, the self-attention mechanism gives the Transformer quadratic computational complexity, which makes it heavier than other models and has limited its adoption in HSI processing. Fortunately, the recently emerging state space model-based Mamba shows great computational efficiency while achieving the modeling power of Transformers. Therefore, in this paper, we make a preliminary attempt to apply Mamba to HSI classification, leading to the proposed spectral-spatial Mamba (SS-Mamba). Specifically, the proposed SS-Mamba mainly consists of a spectral-spatial token generation module and several stacked spectral-spatial Mamba blocks (SS-MB). First, the token generation module converts any given HSI cube into sequences of spatial and spectral tokens. These tokens are then sent to the stacked SS-MBs. Each SS-MB consists of two basic Mamba blocks and a spectral-spatial feature enhancement module. The spatial tokens and the spectral tokens are each processed by their own basic Mamba block. In addition, the feature enhancement module modulates the spatial and spectral tokens using information from the center region of the HSI sample. In this way, the spectral and spatial tokens cooperate with each other and achieve information fusion within each block. Experimental results on widely used HSI datasets show that the proposed model achieves competitive results compared with state-of-the-art methods. The Mamba-based method opens a new window for HSI classification.
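The token generation and center-region modulation steps described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the abstract does not specify how tokens are formed or how the modulation is computed, so the per-pixel spatial tokens, the band-group spectral tokens, the sigmoid gate, and the names `make_tokens`, `center_descriptor`, and `modulate` are all assumptions made for illustration. The Mamba blocks themselves are omitted.

```python
import numpy as np

def make_tokens(cube, num_groups):
    """Turn an HSI patch of shape (H, W, B) into two token sequences.

    Assumed layout (hypothetical, not taken from the paper):
      spatial tokens:  one token per pixel, holding its B-band spectrum
      spectral tokens: one token per band group, holding the group-averaged patch
    """
    h, w, b = cube.shape
    spatial = cube.reshape(h * w, b)                       # (H*W, B)
    groups = np.array_split(cube, num_groups, axis=2)      # split along bands
    spectral = np.stack([g.mean(axis=2).reshape(h * w) for g in groups])  # (G, H*W)
    return spatial, spectral

def center_descriptor(cube, radius=1):
    """Mean spectrum of the (2*radius+1)^2 window around the patch center."""
    h, w, _ = cube.shape
    ci, cj = h // 2, w // 2
    region = cube[ci - radius:ci + radius + 1, cj - radius:cj + radius + 1, :]
    return region.mean(axis=(0, 1))                        # (B,)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def modulate(spatial, spectral, center):
    """Scale both token sequences by gates derived from the center region.

    The sigmoid gating is an assumed stand-in for the paper's feature
    enhancement module, whose exact form the abstract does not give.
    """
    num_groups = spectral.shape[0]
    band_gate = sigmoid(center)                            # (B,), one gate per band
    group_means = [c.mean() for c in np.array_split(center, num_groups)]
    group_gate = sigmoid(np.array(group_means))[:, None]   # (G, 1), one gate per group
    return spatial * band_gate, spectral * group_gate

# Example: a 5x5 patch with 8 bands, split into 4 band groups.
cube = np.random.default_rng(0).normal(size=(5, 5, 8))
spatial, spectral = make_tokens(cube, num_groups=4)
spatial_mod, spectral_mod = modulate(spatial, spectral, center_descriptor(cube))
```

In this sketch both sequences are gated by the same center-region descriptor, which is one simple way for the two token streams to share information inside a block.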

URL

https://arxiv.org/abs/2404.18401

PDF

https://arxiv.org/pdf/2404.18401.pdf

