Abstract
The scarcity of labeled data in real-world scenarios is a critical bottleneck for the effectiveness of deep learning. Semi-supervised semantic segmentation is a common way to balance annotation cost against segmentation performance. However, previous approaches, whether based on consistency regularization or self-training, tend to neglect the contextual knowledge embedded in inter-pixel relations. This neglect leads to suboptimal performance and limited generalization. In this paper, we propose IPixMatch, a novel approach designed to mine this neglected but valuable Inter-Pixel information for semi-supervised learning. Specifically, IPixMatch extends the standard teacher-student network with additional loss terms that capture inter-pixel relations. It performs particularly well in low-data regimes, making efficient use of the limited labeled data and extracting as much utility as possible from the available unlabeled data. Furthermore, IPixMatch can be integrated seamlessly into most teacher-student frameworks without modifying the model or adding components. Our straightforward method delivers consistent performance improvements across various benchmark datasets under different partitioning protocols.
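The abstract does not state the form of the additional loss terms. As a rough illustration only, the sketch below shows one way a loss over inter-pixel relations could be set up in a teacher-student setting: each pixel's class-probability vector is compared with every other pixel's, and the student is penalized when its pairwise relations differ from the teacher's. The function names `relation_matrix` and `inter_pixel_loss`, the dot-product relation, and the squared-error penalty are all assumptions made for this example, not the formulation used by IPixMatch.

```python
def relation_matrix(probs):
    """Pairwise dot products between per-pixel class-probability vectors."""
    return [[sum(a * b for a, b in zip(p, q)) for q in probs] for p in probs]


def inter_pixel_loss(student_probs, teacher_probs):
    """Mean squared difference between student and teacher relation matrices.

    Hypothetical stand-in for an inter-pixel loss term; not the paper's formula.
    """
    rs = relation_matrix(student_probs)
    rt = relation_matrix(teacher_probs)
    n = len(rs)
    total = sum((rs[i][j] - rt[i][j]) ** 2 for i in range(n) for j in range(n))
    return total / (n * n)


# Three pixels, two classes; each row is one pixel's softmax output (made-up values).
teacher = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9]]
student = [[0.6, 0.4], [0.7, 0.3], [0.4, 0.6]]

loss_same = inter_pixel_loss(teacher, teacher)  # identical relations
loss_diff = inter_pixel_loss(student, teacher)  # relations disagree
```

In a real teacher-student framework, a term of this kind would presumably be added to the usual supervised and consistency losses with a weighting factor, which is consistent with the claim that no architectural change is needed.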
URL
https://arxiv.org/abs/2404.18891