Abstract
One-shot semantic segmentation aims to segment query images given only ONE annotated support image of the same class. This task is challenging because target objects in the support and query images can differ greatly in appearance and pose (i.e., intra-class variation). Prior works suggested that incorporating more annotated support images in few-shot settings boosts performance, but this increases costs due to additional manual labeling. In this paper, we propose a novel approach for ONE-shot semantic segmentation, called Group-On, which packs multiple query images into batches for the benefit of mutual knowledge support within the same category. Specifically, after coarse segmentation masks of the batch of queries are predicted, query-mask pairs act as pseudo support data to mutually enhance mask predictions, under the guidance of a simple Group-On Voting module. Comprehensive experiments on three standard benchmarks show that, in the ONE-shot setting, our Group-On approach significantly outperforms previous works by considerable margins. For example, on the COCO-20i dataset, we increase mIoU scores by 8.21% and 7.46% on ASNet and HSNet baselines, respectively. With only one support image, Group-On can even be competitive with counterparts using 5 annotated support images.
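To make the batched mutual-support idea concrete, below is a minimal toy sketch (NOT the paper's actual architecture) of the core mechanism: each query's features and coarse mask act as pseudo support for the other queries in the same-class batch, and a leave-one-out group prototype re-scores every pixel. The function names, masked-average-pooling prototypes, and cosine-similarity voting are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def coarse_to_prototype(feats, mask, eps=1e-6):
    """Masked average pooling: one class prototype per query.

    feats: (C, H, W) feature map; mask: (H, W) coarse soft/binary mask.
    (Hypothetical helper; the paper's module is learned, not hand-crafted.)
    """
    w = mask[None]                                      # (1, H, W)
    return (feats * w).sum(axis=(1, 2)) / (w.sum() + eps)  # (C,)

def group_on_voting(batch_feats, coarse_masks):
    """Toy Group-On-style voting: every (query, coarse-mask) pair serves as
    pseudo support for the OTHER queries in the batch.

    batch_feats:  (B, C, H, W) query feature maps (same class).
    coarse_masks: (B, H, W) coarse predictions in [0, 1].
    Returns refined soft masks (B, H, W) in [0, 1].
    """
    B = batch_feats.shape[0]
    protos = np.stack([coarse_to_prototype(f, m)
                       for f, m in zip(batch_feats, coarse_masks)])  # (B, C)
    refined = []
    for i in range(B):
        # Leave-one-out: pool pseudo-support prototypes from the other queries.
        others = np.concatenate([protos[:i], protos[i + 1:]])
        g = others.mean(axis=0)                          # group prototype (C,)
        f = batch_feats[i]                               # (C, H, W)
        num = np.einsum('c,chw->hw', g, f)
        den = np.linalg.norm(g) * np.linalg.norm(f, axis=0) + 1e-6
        cos = num / den                                  # cosine in [-1, 1]
        refined.append((cos + 1) / 2)                    # map to [0, 1]
    return np.stack(refined)
```

In the actual method this voting is a learned module applied after a one-shot baseline (e.g., HSNet or ASNet) produces the coarse masks; the sketch only shows why batching same-class queries lets them reinforce one another.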
URL
https://arxiv.org/abs/2404.11871