Abstract
Recent developments in deep domain adaptation have allowed knowledge transfer from a labeled source domain to an unlabeled target domain at the level of intermediate features or input pixels. We propose that advantages may be derived by combining them, in the form of different insights that lead to a novel design and complementary properties that result in better performance. At the feature level, inspired by insights from semi-supervised learning, we propose a classification-aware domain adversarial neural network that brings target examples into more classifiable regions of the source domain. Next, we posit that computer vision insights are more amenable to injection at the pixel level. In particular, we use 3D geometry and image synthesis based on a generalized appearance flow to preserve identity across pose transformations, while using an attribute-conditioned CycleGAN to translate a single source image into multiple target images that differ in lower-level properties such as lighting. Besides standard UDA benchmarks, we validate on a novel and apt problem of car recognition in unlabeled surveillance images using labeled images from the web, handling explicitly specified, nameable factors of variation through pixel-level adaptation and implicit, unspecified factors through feature-level adaptation.
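The core mechanism behind a domain adversarial neural network is the gradient reversal trick: the domain classifier is trained normally, but the gradient flowing back into the feature extractor is sign-flipped, pushing features toward domain invariance. The following is a minimal NumPy sketch of that trick with manual backpropagation through a toy linear domain classifier; all names and shapes are illustrative and not the paper's actual architecture.

```python
import numpy as np

# Toy gradient-reversal layer (GRL), the key component of DANN-style
# domain adversarial training. Identity forward, reversed backward.

def grl_forward(x):
    # Forward pass: pass features through unchanged.
    return x

def grl_backward(grad, lam=1.0):
    # Backward pass: flip the sign and scale by lambda before the
    # gradient reaches the feature extractor.
    return -lam * grad

rng = np.random.default_rng(0)
features = rng.normal(size=(4, 3))   # feature extractor output (toy)
W_dom = rng.normal(size=(3, 1))      # linear domain classifier weights

# Forward: domain logits computed on GRL output.
logits = grl_forward(features) @ W_dom

# Backward: chain rule through the linear layer, then reverse the
# gradient before it flows into the feature extractor.
grad_logits = np.ones_like(logits)   # d(loss)/d(logits), dummy value
grad_features = grad_logits @ W_dom.T
grad_to_extractor = grl_backward(grad_features, lam=0.5)
```

With this wiring, the domain classifier descends on its loss while the feature extractor effectively ascends on it, which is what drives the two domains' feature distributions together.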
URL
https://arxiv.org/abs/1803.00068