Abstract
Style transfer usually refers to the task of applying the color and texture of a specific style image to a given content image while preserving the latter's structure. Here we tackle the more generic problem of semantic style transfer: given two unpaired collections of images, we aim to learn a mapping between the corpus-level style of each collection, while preserving semantic content shared across the two domains. We introduce XGAN ("Cross-GAN"), a dual adversarial autoencoder, which captures a shared representation of the common domain semantic content in an unsupervised way, while jointly learning the domain-to-domain image translations in both directions. We exploit ideas from the domain adaptation literature and define a semantic consistency loss which encourages the model to preserve semantics in the learned embedding space. We report promising qualitative results for the task of face-to-cartoon translation. CartoonSet, the cartoon dataset we collected for this purpose, is publicly available at google.github.io/cartoonset/ as a new benchmark for semantic style transfer.
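The semantic consistency loss described above can be illustrated with a minimal sketch: the embedding of an input should stay close to the embedding of its translation into the other domain. The encoder/decoder functions and weight shapes below are hypothetical stand-ins (in the paper these are convolutional networks with partially shared layers), not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical linear encoders/decoders standing in for XGAN's networks.
W_enc1 = rng.normal(size=(8, 16))   # e1: domain-1 image -> shared embedding
W_enc2 = rng.normal(size=(8, 16))   # e2: domain-2 image -> shared embedding
W_dec2 = rng.normal(size=(16, 8))   # d2: shared embedding -> domain-2 image

def encode1(x):
    return np.tanh(W_enc1 @ x)

def encode2(x):
    return np.tanh(W_enc2 @ x)

def decode2(z):
    return np.tanh(W_dec2 @ z)

def semantic_consistency_loss(x1):
    """Distance between the embedding of x1 and the embedding of its
    translation g_{1->2}(x1) = d2(e1(x1)). Minimizing this encourages the
    translation to preserve semantics in the shared embedding space."""
    z = encode1(x1)           # embed the domain-1 input
    x12 = decode2(z)          # translate it into domain 2
    z_back = encode2(x12)     # re-encode the translated image
    return float(np.sum((z - z_back) ** 2))

x = rng.normal(size=16)       # a toy "image" as a flat feature vector
loss = semantic_consistency_loss(x)
```

In training this term would be added (symmetrically, for both translation directions) to the reconstruction and adversarial losses, with gradients updating the encoder and decoder weights jointly.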
URL
https://arxiv.org/abs/1711.05139