Abstract
Labelling difficulty has been a longstanding problem in deep image matting. To escape fine labels, this work explores using rough annotations, such as trimaps that coarsely indicate the foreground/background, as supervision. We show that the cooperation between semantics learned from the indicated known regions and properly assumed matting rules can help infer alpha values in transition areas. Inspired by the nonlocal principle in traditional image matting, we build a directional distance consistency loss (DDC loss) over each pixel neighborhood to constrain the alpha values conditioned on the input image. DDC loss forces the distances of similar pairs on the alpha matte and on its corresponding image to be consistent. In this way, alpha values can be propagated from the learned known regions to the unknown transition areas. With only images and trimaps, a matting model can be trained under the supervision of a known-region loss and the proposed DDC loss. Experiments on the AM-2K and P3M-10K datasets show that our paradigm achieves performance comparable to the fine-label-supervised baseline, while sometimes offering even more satisfying results than the human-labelled ground truth. Code is available at \url{this https URL}.
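The core idea — within each pixel neighborhood, make pairwise distances on the predicted alpha matte consistent with the corresponding color distances on the input image, so alpha propagates from known regions into transition areas — can be sketched roughly as follows. This is a minimal NumPy illustration; the function name, neighborhood radius, distance metric, and normalization are our assumptions, not the paper's exact formulation:

```python
import numpy as np

def _shift_pair(arr, dy, dx):
    """Return aligned views of `arr`: each pixel and its (dy, dx) neighbor."""
    H, W = arr.shape[:2]
    src = arr[max(0, -dy):H - max(0, dy), max(0, -dx):W - max(0, dx)]
    nbr = arr[max(0, dy):H + min(0, dy), max(0, dx):W + min(0, dx)]
    return src, nbr

def ddc_style_loss(image, alpha, radius=1):
    """Illustrative distance-consistency loss (not the paper's exact DDC loss).

    image: (H, W, C) float array in [0, 1]
    alpha: (H, W) float array in [0, 1]
    For every neighbor offset within `radius`, penalize the gap between the
    color distance on the image and the alpha distance on the matte.
    """
    offsets = [(dy, dx)
               for dy in range(-radius, radius + 1)
               for dx in range(-radius, radius + 1)
               if (dy, dx) != (0, 0)]
    total = 0.0
    for dy, dx in offsets:
        img_src, img_nbr = _shift_pair(image, dy, dx)
        a_src, a_nbr = _shift_pair(alpha, dy, dx)
        img_dist = np.linalg.norm(img_src - img_nbr, axis=-1)  # color distance
        alpha_dist = np.abs(a_src - a_nbr)                     # matte distance
        total += np.abs(alpha_dist - img_dist).mean()          # consistency gap
    return total / len(offsets)
```

When the matte's local variation matches the image's (e.g., a single-channel image equal to the alpha map), the loss is zero; a flat matte over a varying image yields a positive penalty, which is the signal that would pull predictions toward image-consistent transitions.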
URL
https://arxiv.org/abs/2408.10539