Abstract
Underwater image enhancement (UIE) is a critical preprocessing step for marine vision applications, where wavelength-dependent attenuation causes severe content degradation and color distortion. Recent state space models such as Mamba show potential for long-range dependency modeling, but their unfolding of images into 1D sequences and their fixed scan paths fail to adapt to local object semantics and global relations, limiting their efficacy in complex underwater environments. To address this, we enhance conventional Mamba with a sorting-based scanning mechanism that dynamically reorders the scanning sequence according to the statistical distribution of spatial correlations among all pixels. This encourages the network to prioritize the most informative components, namely structural and semantic features. Building on this mechanism, we devise a Visually Self-adaptive State Block (VSSB) that combines Mamba's dynamic sorting with input-dependent dynamic convolution, enabling coherent integration of global context and local relational cues. This design helps eliminate global focus bias, especially for widely distributed content, which greatly weakens the statistical frequency. For robust feature extraction and refinement, we design a cross-feature bridge (CFB) that adaptively fuses multi-scale representations. Together, these components form RD-UIE, a novel relation-driven Mamba framework for effective UIE. Extensive experiments on underwater enhancement benchmarks demonstrate that RD-UIE outperforms the state-of-the-art approach WMamba in both quantitative metrics and visual fidelity, achieving an average gain of 0.55 dB across the three benchmarks. Our code is publicly available.
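The sorting-based scanning idea can be illustrated with a small NumPy sketch. This is not the authors' implementation: the abstract does not specify the scoring rule or the state update, so the mean cosine similarity score (`correlation_scores`), the exponential-moving-average recurrence standing in for a Mamba state update, and the names `sorted_scan` and `decay` are all assumptions made for illustration. The sketch only shows the general pattern: score every pixel, sort to obtain a content-dependent scan order, run a 1D recurrent scan in that order, and scatter the results back to their spatial positions.

```python
import numpy as np

def correlation_scores(tokens):
    """Assumed scoring rule: mean cosine similarity of each pixel to all pixels.

    tokens: (N, C) array of pixel features. Returns an (N,) array of scores.
    """
    norms = np.linalg.norm(tokens, axis=1, keepdims=True)
    unit = tokens / np.maximum(norms, 1e-8)
    # mean_j(unit_i . unit_j) equals unit_i . mean_j(unit_j), so no N x N matrix is needed.
    return unit @ unit.mean(axis=0)

def sorted_scan(feat_map, decay=0.9):
    """Scan pixels in score-sorted order, then restore the spatial layout.

    feat_map: (H, W, C) array. The recurrence below is a plain exponential
    moving average used as a stand-in for a state space update (assumption).
    Returns the scanned feature map and the scan order that was used.
    """
    h, w, c = feat_map.shape
    tokens = feat_map.reshape(h * w, c).astype(float)

    # Content-dependent scan order: highest-scoring pixels are visited first.
    order = np.argsort(-correlation_scores(tokens), kind="stable")
    seq = tokens[order]

    out = np.empty_like(seq)
    state = np.zeros(c)
    for t in range(seq.shape[0]):
        state = decay * state + (1.0 - decay) * seq[t]
        out[t] = state

    # Undo the permutation so each output returns to its original pixel position.
    restored = np.empty_like(out)
    restored[order] = out
    return restored.reshape(h, w, c), order
```

Calling `sorted_scan(x)` on an `(H, W, C)` array returns an array of the same shape together with the permutation that was used. A fixed raster scan corresponds to replacing `order` with `np.arange(h * w)`, which makes the contrast with input-dependent ordering explicit.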
URL
https://arxiv.org/abs/2505.01224