Abstract
Land cover analysis using hyperspectral images (HSI) remains an open problem because of their low spatial resolution and complex spectral information. Recent studies mainly design Transformer-based architectures to model long-range spatial-spectral dependencies, which is computationally expensive owing to their quadratic complexity. The selective structured state space model (Mamba), which models long-range dependencies efficiently with linear complexity, has recently shown promising progress. However, its potential for hyperspectral image processing, which requires handling numerous spectral bands, has not yet been explored. In this paper, we propose S$^2$Mamba, a spatial-spectral state space model for hyperspectral image classification that extracts spatial-spectral contextual features, resulting in more efficient and accurate land cover analysis. In S$^2$Mamba, two selective structured state space models operating along different dimensions are designed for feature extraction, one spatial and the other spectral, along with a spatial-spectral mixture gate that fuses their outputs. More specifically, S$^2$Mamba first captures spatial contextual relations by letting each pixel interact with its adjacent pixels through a Patch Cross Scanning module, and then explores semantic information from continuous spectral bands through a Bi-directional Spectral Scanning module. Considering the distinct strengths of the two attributes in homogeneous and complicated texture scenes, we realize the Spatial-spectral Mixture Gate with a group of learnable matrices, allowing representations learned across different dimensions to be incorporated adaptively. Extensive experiments on HSI classification benchmarks demonstrate the superiority and promise of S$^2$Mamba. The code will be available at: this https URL.
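To make the two ideas named above more concrete, the following is a minimal illustrative sketch, not the authors' implementation. The abstract does not specify the exact form of the scanning modules or the gate, so everything here is an assumption: the function names `bidirectional_scan` and `mixture_gate` are hypothetical, the gate is modeled as a per-channel softmax over two logits standing in for the learnable matrices, and the state space models themselves are omitted.

```python
import math


def bidirectional_scan(spectrum):
    """Return one pixel's band sequence in forward and reverse order.

    Assumption: a bi-directional spectral scan feeds both orderings
    to a sequence model; the sequence model itself is not shown.
    """
    forward = list(spectrum)
    backward = forward[::-1]
    return forward, backward


def mixture_gate(spatial_feat, spectral_feat, gate_logits):
    """Fuse two equal-length feature vectors channel by channel.

    gate_logits holds one (spatial_logit, spectral_logit) pair per
    channel. These pairs are a hypothetical stand-in for learnable
    gate parameters; a softmax turns each pair into two weights that
    sum to 1.
    """
    fused = []
    for s, c, (a, b) in zip(spatial_feat, spectral_feat, gate_logits):
        m = max(a, b)  # subtract the max for numerical stability
        ea, eb = math.exp(a - m), math.exp(b - m)
        w_spatial = ea / (ea + eb)
        fused.append(w_spatial * s + (1.0 - w_spatial) * c)
    return fused


# Toy features; equal logits weight both branches at 0.5.
spatial = [1.0, 2.0, 3.0]
spectral = [3.0, 2.0, 1.0]
logits = [(0.0, 0.0)] * 3
fused = mixture_gate(spatial, spectral, logits)
forward, backward = bidirectional_scan([0.1, 0.2, 0.3])
```

With equal logits the gate reduces to a plain average of the two branches. Raising the spatial logit of a channel shifts that channel toward the spatial feature, which is one simple way an adaptive fusion of this kind could favor one branch in homogeneous regions and the other in textured ones.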
URL
https://arxiv.org/abs/2404.18213