Abstract
Part-aware panoptic segmentation is a computer vision problem that aims to provide a semantic understanding of the scene at multiple levels of granularity. More precisely, semantic areas, object instances, and semantic parts are predicted simultaneously. In this paper, we present our Joint Panoptic Part Fusion (JPPF), which effectively combines the three individual segmentations to obtain a panoptic-part segmentation. Two aspects are of utmost importance for this: first, a unified model for the three problems is desired that allows for mutually improved and consistent representation learning; second, the combination must be balanced so that all individual results receive equal importance during fusion. Our proposed JPPF is parameter-free and dynamically balances its inputs. The method is evaluated and compared on the Cityscapes Panoptic Parts (CPP) and Pascal Panoptic Parts (PPP) datasets in terms of PartPQ and Part-Whole Quality (PWQ). In extensive experiments, we verify the importance of our fair fusion, highlight its most significant impact for areas that can be further segmented into parts, and demonstrate the generalization capabilities of our design on five additional datasets without fine-tuning.
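The abstract describes a parameter-free fusion that gives the three segmentation heads equal importance. As a minimal, hypothetical sketch of that idea (the function names, tensor shapes, and the assumption that all three score maps are already aligned to a shared label space are ours, not the paper's):

```python
import numpy as np

def softmax(logits, axis=-1):
    """Numerically stable softmax over the class axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)

def fuse_equal_weight(semantic_logits, instance_logits, part_logits):
    """Equal-weight fusion of three per-pixel score maps.

    All inputs are assumed to be (H, W, C) logits mapped to a shared
    label space. Each branch is normalized independently so that no
    single head dominates, then the probabilities are averaged with
    equal weight -- a parameter-free combination.
    """
    probs = [softmax(l) for l in (semantic_logits, instance_logits, part_logits)]
    fused = sum(probs) / len(probs)   # equal importance for every branch
    return fused.argmax(axis=-1)      # per-pixel fused label
```

The actual JPPF operates on panoptic and part label spaces with their own alignment logic; this sketch only illustrates the balanced, parameter-free averaging the abstract refers to.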
URL
https://arxiv.org/abs/2311.18618