Abstract
Photorealistic color retouching plays a vital role in visual content creation, yet manual retouching remains inaccessible to non-experts due to its reliance on specialized expertise. Reference-based methods offer a promising alternative by transferring the color preset of a reference image to a source image. However, these approaches often operate as novice learners, performing global color mappings derived from pixel-level statistics, without a true understanding of semantic context or human aesthetics. To address this issue, we propose SemiNFT, a Diffusion Transformer (DiT)-based retouching framework that mirrors the trajectory of human artistic training: beginning with rigid imitation and evolving into intuitive creation. Specifically, SemiNFT is first taught with paired triplets to acquire basic structural preservation and color mapping skills, and then advanced to reinforcement learning (RL) on unpaired data to cultivate nuanced aesthetic perception. Crucially, during the RL stage, to prevent catastrophic forgetting of old skills, we design a hybrid online-offline reward mechanism that anchors aesthetic exploration with structural review. Extensive experiments show that SemiNFT not only outperforms state-of-the-art methods on standard preset transfer benchmarks but also demonstrates remarkable intelligence in zero-shot tasks, such as black-and-white photo colorization and cross-domain (anime-to-photo) preset transfer. These results confirm that SemiNFT transcends simple statistical matching and achieves a sophisticated level of aesthetic comprehension. Our project can be found at this https URL.
Abstract (translated)
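Photorealistic color retouching plays a vital role in visual content creation, yet manual retouching remains out of reach for non-experts because it depends on specialized expertise. Reference-based methods offer a promising alternative by transferring the color preset of a reference image to a source image. However, these methods often behave like novice learners, performing only global color mappings derived from pixel-level statistics without understanding semantic context or human aesthetics. To address this, we propose SemiNFT, a Diffusion Transformer (DiT)-based retouching framework that mirrors the trajectory of human artistic training: starting with strict imitation and gradually evolving into intuitive creation. Specifically, SemiNFT first learns from paired triplets to acquire basic structure preservation and color mapping skills, then moves on to a reinforcement learning (RL) stage on unpaired data to cultivate fine-grained aesthetic perception. Importantly, to prevent catastrophic forgetting of earlier skills during the RL stage, we design a hybrid online-offline reward mechanism that couples aesthetic exploration with structural review. Experiments show that SemiNFT not only outperforms existing methods on standard preset transfer benchmarks but also performs impressively on zero-shot tasks such as black-and-white photo colorization and cross-domain (anime-to-photo) preset transfer. These results confirm that SemiNFT goes beyond simple statistical matching and reaches a sophisticated level of aesthetic understanding. Our project can be found at the provided link.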
URL
https://arxiv.org/abs/2602.08582