Abstract
The proposed architecture, Dual Attentive U-Net with Feature Infusion (DAU-FI Net), addresses challenges in semantic segmentation, particularly on multiclass imbalanced datasets with limited samples. DAU-FI Net integrates multiscale spatial-channel attention mechanisms and feature injection to improve the precision of object localization. Its core is a multiscale depth-separable convolution block that captures localized patterns across scales. This block is complemented by a spatial-channel squeeze and excitation (scSE) attention unit, which models inter-dependencies between channels and spatial regions in the feature maps. In addition, additive attention gates on the connections between the encoder and decoder pathways refine the segmentation. To augment the model, engineered features, derived from Gabor filters for texture analysis and from Sobel and Canny filters for edge detection, are injected under the guidance of semantic masks to strategically expand the feature space. Comprehensive experiments on a challenging sewer pipe and culvert defect dataset and on a benchmark dataset validate DAU-FI Net's capabilities. Ablation studies highlight the incremental benefits of the attention blocks and of feature injection. DAU-FI Net achieves a state-of-the-art mean Intersection over Union (IoU) of 95.6% on the defect test set and 98.8% on the benchmark, surpassing prior methods by 8.9% and 12.6%, respectively. The proposed architecture provides a robust solution that advances semantic segmentation for multiclass problems with limited training data. Our sewer-culvert defect dataset, which features pixel-level annotations, opens avenues for further research in this crucial domain. Overall, this work contributes innovations in architecture, attention, and feature engineering that improve semantic segmentation performance.
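For readers unfamiliar with the reported metric, the following is a minimal sketch of how a mean IoU score can be computed from a predicted mask and a ground-truth mask. It is not the authors' evaluation code: the function name mean_iou, the flat-list mask representation, and the choice to average only over classes that appear in at least one of the two masks are assumptions made for illustration, and the abstract does not state which averaging convention was used.

```python
from typing import Sequence


def mean_iou(y_true: Sequence[int], y_pred: Sequence[int], num_classes: int) -> float:
    """Mean IoU over the classes present in the ground truth or the prediction.

    Hypothetical helper for illustration; masks are flattened lists of class ids.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    intersection = [0] * num_classes  # pixels where label and prediction are both class c
    true_count = [0] * num_classes    # pixels labelled as class c
    pred_count = [0] * num_classes    # pixels predicted as class c

    for t, p in zip(y_true, y_pred):
        true_count[t] += 1
        pred_count[p] += 1
        if t == p:
            intersection[t] += 1

    ious = []
    for c in range(num_classes):
        # |A union B| = |A| + |B| - |A intersect B|
        union = true_count[c] + pred_count[c] - intersection[c]
        if union > 0:  # assumption: skip classes absent from both masks
            ious.append(intersection[c] / union)

    if not ious:
        raise ValueError("no class is present in either mask")
    return sum(ious) / len(ious)


# Toy 2x3 mask flattened row by row, with three classes.
truth = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
print(mean_iou(truth, pred, num_classes=3))
```

In the toy example the per-class IoU values are 1/3, 2/3, and 1/2, so the mean is 0.5. On an imbalanced dataset such as the one described above, averaging per class in this way gives rare defect classes the same weight as frequent ones, which is one reason mean IoU is commonly reported alongside or instead of pixel accuracy.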
URL
https://arxiv.org/abs/2312.14053