Abstract
We introduce in-context matting, a novel task setting for image matting. Given a reference image of a certain foreground and guided priors such as points, scribbles, and masks, in-context matting enables automatic alpha estimation on a batch of target images of the same foreground category, without additional auxiliary input. This setting marries the strong performance of auxiliary-input-based matting with the ease of use of automatic matting, striking a good trade-off between customization and automation. To overcome the key challenge of accurate foreground matching, we introduce IconMatting, an in-context matting model built upon a pre-trained text-to-image diffusion model. Conditioned on inter- and intra-similarity matching, IconMatting makes full use of the reference context to generate accurate target alpha mattes. To benchmark the task, we also introduce a novel testing dataset, ICM-57, covering 57 groups of real-world images. Quantitative and qualitative results on the ICM-57 testing set show that IconMatting rivals the accuracy of trimap-based matting while retaining an automation level akin to automatic matting. Code is available at this https URL
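The inter-similarity matching described above can be illustrated with a minimal sketch: score every target location by its feature similarity to the reference foreground indicated by the guided prior. This is an assumption-laden toy version (the function name, plain cosine similarity, and pooling of reference features into a single context vector are all illustrative choices, not the paper's actual architecture):

```python
import numpy as np

def inter_similarity_map(ref_feats, tgt_feats, ref_mask):
    """Toy inter-similarity matching (illustrative, not the paper's model).

    ref_feats, tgt_feats: (N, C) arrays of per-location features,
        e.g. extracted from a pre-trained diffusion backbone.
    ref_mask: (N,) boolean mask marking guided foreground locations
        in the reference (from points, scribbles, or a mask).
    Returns an (N,) similarity score for each target location.
    """
    # L2-normalize features so dot products become cosine similarities
    ref = ref_feats / np.linalg.norm(ref_feats, axis=1, keepdims=True)
    tgt = tgt_feats / np.linalg.norm(tgt_feats, axis=1, keepdims=True)
    # Pool the guided reference foreground into one context vector
    ctx = ref[ref_mask].mean(axis=0)
    ctx = ctx / np.linalg.norm(ctx)
    # Cosine similarity of every target location to the reference context
    return tgt @ ctx
```

High-scoring target locations would then serve as foreground evidence for the downstream alpha-matte decoder; the actual model additionally exploits intra-image similarity within the target.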
URL
https://arxiv.org/abs/2403.15789