Abstract
In recent years, Transformers have garnered significant attention for Hyperspectral Image Classification (HSIC) due to their self-attention mechanism, which yields strong classification performance. However, these models face a major challenge in computational efficiency, as their complexity scales quadratically with sequence length. The Mamba architecture, built on a State Space Model, offers a more efficient alternative to Transformers. This paper introduces the Spatial-Spectral Morphological Mamba (MorpMamba) model. In MorpMamba, a token generation module first converts the Hyperspectral Image (HSI) patch into spatial-spectral tokens. These tokens are then processed by a morphology block, which computes structural and shape information using depthwise separable convolutional operations. The extracted information is enhanced in a feature enhancement module that adjusts the spatial and spectral tokens based on the center region of the HSI sample, allowing for effective information fusion within each block. Subsequently, a multi-head self-attention block refines the feature space. Finally, the fused information is fed into the state space block for classification and generation of the classification maps. Experiments on widely used Hyperspectral (HS) datasets demonstrate that the MorpMamba model outperforms both CNN and Transformer models while being more parameter-efficient.
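A minimal PyTorch sketch of the pipeline described above, under stated assumptions: the morphological operators are emulated here with max pooling (a morphological gradient) plus depthwise separable convolutions, the center-based gate is one plausible form of the feature enhancement module, and an nn.GRU stands in for the selective state-space (Mamba) layer purely for illustration. All module names (MorphologyBlock, CenterEnhancement, MorpMambaSketch) are hypothetical and not taken from the paper's code.

```python
import torch
import torch.nn as nn

class MorphologyBlock(nn.Module):
    # Structural/shape cues via depthwise separable convolutions; grayscale
    # dilation and erosion are emulated with max pooling (an assumption; the
    # paper's exact morphological operators may differ).
    def __init__(self, dim, k=3):
        super().__init__()
        p = k // 2
        self.depthwise = nn.Conv2d(dim, dim, k, padding=p, groups=dim)
        self.pointwise = nn.Conv2d(dim, dim, 1)
        self.pool = nn.MaxPool2d(k, stride=1, padding=p)

    def forward(self, x):                       # x: (B, C, H, W)
        conv = self.pointwise(self.depthwise(x))
        grad = self.pool(x) + self.pool(-x)     # dilation - erosion (morph. gradient)
        return conv + grad

class CenterEnhancement(nn.Module):
    # Gates every token by its affinity to the patch-center token, so the
    # features emphasize the pixel being classified (assumed form).
    def __init__(self, dim):
        super().__init__()
        self.proj = nn.Linear(dim, dim)

    def forward(self, t):                       # t: (B, N, C)
        center = t[:, t.shape[1] // 2]          # token of the center pixel
        return t * torch.sigmoid(self.proj(center)).unsqueeze(1)

class MorpMambaSketch(nn.Module):
    # End-to-end flow: patch -> tokens -> morphology -> center enhancement ->
    # multi-head self-attention -> state-space stand-in -> class logits.
    def __init__(self, bands, dim, num_classes, heads=4):
        super().__init__()
        self.tokenizer = nn.Conv2d(bands, dim, 1)   # per-pixel spectral embedding
        self.morph = MorphologyBlock(dim)
        self.enhance = CenterEnhancement(dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.ssm = nn.GRU(dim, dim, batch_first=True)  # placeholder for Mamba
        self.head = nn.Linear(dim, num_classes)

    def forward(self, x):                       # x: (B, bands, H, W)
        f = self.morph(self.tokenizer(x))
        t = f.flatten(2).transpose(1, 2)        # (B, H*W, dim) tokens
        t = self.enhance(t)
        t, _ = self.attn(t, t, t)
        t, _ = self.ssm(t)
        return self.head(t.mean(dim=1))         # pool tokens, predict the class

model = MorpMambaSketch(bands=200, dim=64, num_classes=16)
logits = model(torch.randn(2, 200, 9, 9))       # two 9x9 HSI patches
```

A real reproduction would replace the nn.GRU with an actual selective SSM layer (e.g., the Mamba block from the mamba_ssm package) to obtain the linear-in-sequence-length scaling the abstract contrasts with quadratic self-attention.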
URL
https://arxiv.org/abs/2408.01372