Efficient Portrait Matte Creation With Layer Diffusion and Connectivity Priors

Abstract

Learning effective deep portrait matting models requires training data that is both high in quality and large in quantity. For portrait matting, however, neither requirement is easy to meet. Because the most accurate ground-truth portrait mattes are captured in front of a green screen, harvesting a large-scale portrait matting dataset in practice is almost impossible. This work shows that text prompts and the recent Layer Diffusion model can be used to generate high-quality portrait foregrounds and extract latent portrait mattes. These mattes cannot be used directly, however, because they contain significant generation artifacts. Inspired by a connectivity prior observed in portrait images, namely that the border of a portrait foreground always appears connected, a connectivity-aware approach is introduced to refine the mattes. Building on this, a large-scale portrait matting dataset termed LD-Portrait-20K is created, with $20,051$ portrait foregrounds and high-quality alpha mattes. Extensive experiments demonstrate the value of LD-Portrait-20K: models trained on it significantly outperform those trained on other datasets. In addition, comparisons with the chroma keying algorithm and an ablation study on dataset capacity further confirm the effectiveness of the proposed matte creation approach. The dataset also contributes to state-of-the-art video portrait matting, implemented with simple video segmentation and a trimap-based image matting model trained on this dataset.
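
To make the connectivity prior concrete, here is a minimal sketch of one way such a prior could be applied to a generated matte. It is not the authors' refinement method, which the abstract does not specify. It only illustrates the general idea that a portrait foreground forms one connected region, so isolated foreground blobs are likely generation artifacts. The function name `keep_largest_component`, the 4-connectivity choice, and the 0.5 threshold are all assumptions made for illustration.

```python
from collections import deque


def keep_largest_component(alpha, threshold=0.5):
    """Zero out foreground blobs that are not part of the largest region.

    `alpha` is a list of rows of floats in [0, 1]. Pixels at or above
    `threshold` (an assumed value) count as foreground and are grouped into
    4-connected components. Pixels below the threshold are left untouched,
    so soft edges around the kept region survive.
    """
    height = len(alpha)
    width = len(alpha[0]) if height else 0
    labels = [[0] * width for _ in range(height)]  # 0 means "no component"
    sizes = {}
    next_label = 1

    for y in range(height):
        for x in range(width):
            if alpha[y][x] < threshold or labels[y][x]:
                continue
            # Breadth-first flood fill of one connected component.
            labels[y][x] = next_label
            queue = deque([(y, x)])
            count = 0
            while queue:
                cy, cx = queue.popleft()
                count += 1
                for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                               (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and alpha[ny][nx] >= threshold
                            and not labels[ny][nx]):
                        labels[ny][nx] = next_label
                        queue.append((ny, nx))
            sizes[next_label] = count
            next_label += 1

    if not sizes:
        return [row[:] for row in alpha]  # no foreground found, return a copy

    largest = max(sizes, key=sizes.get)
    # Keep unlabeled pixels and the largest component; clear smaller blobs.
    return [[a if labels[y][x] in (0, largest) else 0.0
             for x, a in enumerate(row)]
            for y, row in enumerate(alpha)]


# Toy 3x4 matte: a 2x2 foreground block plus one stray pixel at row 1, col 3.
matte = [
    [0.9, 0.8, 0.0, 0.0],
    [0.7, 0.9, 0.0, 0.6],
    [0.0, 0.2, 0.0, 0.0],
]
cleaned = keep_largest_component(matte)
```

In this toy case the stray pixel is cleared while the main block and the faint 0.2 pixel below it are preserved. A practical pipeline would more likely run on full-resolution arrays with a library routine for connected-component labeling.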

URL

https://arxiv.org/abs/2501.16147

PDF

https://arxiv.org/pdf/2501.16147.pdf
