WaveMamba: Spatial-Spectral Wavelet Mamba for Hyperspectral Image Classification

2024-08-02 12:44:07
Muhammad Ahmad, Muhammad Usama, Manuel Mazzara

Abstract

Hyperspectral Imaging (HSI) has proven to be a powerful tool for capturing detailed spectral and spatial information across diverse applications. Despite the advancements in Deep Learning (DL) and Transformer architectures for HSI Classification (HSIC), challenges such as computational efficiency and the need for extensive labeled data persist. This paper introduces WaveMamba, a novel approach that integrates wavelet transformation with the Spatial-Spectral Mamba architecture to enhance HSIC. WaveMamba captures both local texture patterns and global contextual relationships in an end-to-end trainable model. The Wavelet-based enhanced features are then processed through the state-space architecture to model spatial-spectral relationships and temporal dependencies. The experimental results indicate that WaveMamba surpasses existing models, achieving an accuracy improvement of 4.5% on the University of Houston dataset and a 2.0% increase on the Pavia University dataset. These findings validate its effectiveness in addressing the complex data interactions inherent in HSIs.
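
The abstract describes a two-stage idea: wavelet features first, then a state-space model over them. The following is a minimal NumPy sketch of that general pattern, not the authors' implementation. Everything in it is an assumption made for illustration: the choice of a single-level Haar wavelet, the function names (`haar_dwt2`, `wavelet_tokens`, `ssm_scan`), the token layout, and the use of a plain linear recurrence in place of Mamba's selective (input-dependent) scan.

```python
import numpy as np

def haar_dwt2(band):
    """Single-level orthonormal 2D Haar transform of one (H, W) band.

    H and W must be even. Returns four (H/2, W/2) sub-bands:
    one approximation and three detail sub-bands.
    """
    a = band[0::2, 0::2]  # top-left pixel of each 2x2 block
    b = band[0::2, 1::2]  # top-right
    c = band[1::2, 0::2]  # bottom-left
    d = band[1::2, 1::2]  # bottom-right
    approx = (a + b + c + d) / 2.0    # local average (scaled)
    row_diff = (a + b - c - d) / 2.0  # top row minus bottom row
    col_diff = (a - b + c - d) / 2.0  # left column minus right column
    diag = (a - b - c + d) / 2.0      # diagonal difference
    return approx, row_diff, col_diff, diag

def wavelet_tokens(patch):
    """Turn an (H, W, bands) patch into a (H/2 * W/2, 4 * bands) token sequence.

    Hypothetical layout: each spatial position of the half-resolution grid
    becomes one token holding all sub-band coefficients of all bands.
    """
    per_band = [np.stack(haar_dwt2(patch[:, :, k]), axis=-1)
                for k in range(patch.shape[2])]
    feats = np.concatenate(per_band, axis=-1)  # (H/2, W/2, 4 * bands)
    return feats.reshape(-1, feats.shape[-1])

def ssm_scan(x, a, B, C):
    """Plain linear state-space recurrence (not Mamba's selective scan).

    h_t = a * h_{t-1} + B @ x_t,   y_t = C @ h_t
    x: (T, D) tokens, a: (N,) diagonal transition, B: (N, D), C: (D_out, N).
    """
    h = np.zeros(a.shape[0])
    outputs = []
    for x_t in x:
        h = a * h + B @ x_t
        outputs.append(C @ h)
    return np.stack(outputs)

# Usage on a random stand-in for an 8x8 patch with 5 spectral bands.
rng = np.random.default_rng(0)
patch = rng.standard_normal((8, 8, 5))
tokens = wavelet_tokens(patch)
y = ssm_scan(tokens,
             a=np.full(4, 0.9),
             B=rng.standard_normal((4, 20)),
             C=rng.standard_normal((3, 4)))
```

Because the Haar step here is orthonormal, the tokens carry the same total energy as the input patch, so the wavelet stage rearranges information into approximation and detail channels without discarding any. A trained model would learn `a`, `B`, and `C` (and, in Mamba, make them depend on the input) rather than sampling them at random as this sketch does.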

URL

https://arxiv.org/abs/2408.01231

PDF

https://arxiv.org/pdf/2408.01231.pdf

