Abstract
We propose and study a task we name panoptic segmentation (PS). Panoptic segmentation unifies the typically distinct tasks of semantic segmentation (assign a class label to each pixel) and instance segmentation (detect and segment each object instance). The proposed task requires generating a coherent scene segmentation that is rich and complete, an important step toward real-world vision systems. While early work in computer vision addressed related image/scene parsing tasks, these are not currently popular, possibly due to lack of appropriate metrics or associated recognition challenges. To address this, we propose a novel panoptic quality (PQ) metric that captures performance for all classes (stuff and things) in an interpretable and unified manner. Using the proposed metric, we perform a rigorous study of both human and machine performance for PS on three existing datasets, revealing interesting insights about the task. The aim of our work is to revive the interest of the community in a more unified view of image segmentation.
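The abstract names the panoptic quality (PQ) metric but does not say how it is computed. The following is a minimal sketch, assuming the definition given in the body of the paper: a predicted segment and a ground-truth segment of the same class match when their IoU exceeds 0.5, and PQ is the sum of matched IoUs divided by |TP| + 0.5|FP| + 0.5|FN|. The function name `panoptic_quality` and the list-of-boolean-masks input format are hypothetical choices made for illustration. Void regions and crowd annotations are ignored here.

```python
import numpy as np


def panoptic_quality(pred_segments, gt_segments):
    """Illustrative PQ for a single class (hypothetical helper, not the official code).

    Both arguments are lists of boolean masks of identical shape. Masks within
    each list are assumed not to overlap, as the panoptic format requires, so
    a match with IoU > 0.5 is unique.
    """
    matched_ious = []
    matched_gt = set()
    for pred in pred_segments:
        for j, gt in enumerate(gt_segments):
            if j in matched_gt:
                continue
            union = np.logical_or(pred, gt).sum()
            if union == 0:
                continue
            iou = np.logical_and(pred, gt).sum() / union
            if iou > 0.5:  # assumed matching threshold
                matched_ious.append(float(iou))
                matched_gt.add(j)
                break

    tp = len(matched_ious)           # matched pairs
    fp = len(pred_segments) - tp     # unmatched predictions
    fn = len(gt_segments) - tp       # unmatched ground-truth segments
    denom = tp + 0.5 * fp + 0.5 * fn
    return sum(matched_ious) / denom if denom > 0 else 0.0


# Toy 4x4 example: one ground-truth segment, two predicted segments.
gt_a = np.zeros((4, 4), dtype=bool)
gt_a[:, :2] = True                   # columns 0-1
pred_a = np.zeros((4, 4), dtype=bool)
pred_a[:, :3] = True                 # columns 0-2, overlaps gt_a
pred_b = np.zeros((4, 4), dtype=bool)
pred_b[:, 3] = True                  # column 3, matches nothing

pq = panoptic_quality([pred_a, pred_b], [gt_a])
```

In this sketch, the score for a full image set would be obtained by computing PQ per class and averaging over classes, which is what lets stuff and thing classes be treated uniformly.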
URL
https://arxiv.org/abs/1801.00868