Purifying Naturalistic Images through a Real-time Style Transfer Semantics Network

2019-03-14 05:33:08
Tongtong Zhao, Yuxiao Yan, Ibrahim Shehi Shehu, Xianping Fu, Huibing Wang

Abstract

Recent progress in learning-by-synthesis has proposed training models on synthetic images, which can effectively reduce the cost of human and material resources. However, because the distribution of synthetic images differs from that of real images, the desired performance still cannot be achieved. Real images contain multiple forms of light orientation, while synthetic images contain a uniform light orientation. These features are considered characteristic of outdoor and indoor scenes, respectively. To solve this problem, previous methods learned a model to improve the realism of synthetic images. In contrast to those methods, this paper takes the first step toward purifying real images. Through a style transfer task, the distribution of outdoor real images is converted into that of indoor synthetic images, thereby reducing the influence of light. To this end, the paper proposes a real-time style transfer network that preserves the content information of an input image (the real image), e.g., gaze direction and pupil center position, while inferring the style information of a style image (the synthetic image), e.g., image color structure and semantic features. In addition, the network accelerates the convergence of the model and adapts to multi-scale images. Experiments were performed using mixed (qualitative and quantitative) methods to demonstrate the possibility of purifying real images in complex directions. Qualitatively, the proposed method is compared with available methods on a series of indoor and outdoor scenarios from the LPW dataset. Quantitatively, the purified images are evaluated by training a gaze estimation model across datasets. The results show a significant improvement over the baseline method when purified images are used instead of raw real images.
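The abstract does not spell out the network's loss functions, so the following is only a minimal sketch of the generic content/style split that style transfer methods commonly rely on, not the authors' formulation. It assumes feature maps of shape (channels, height, width) have already been extracted by some network. The function names, the Gram-matrix style statistic, and the weights `content_weight` and `style_weight` are all illustrative assumptions.

```python
import numpy as np


def gram_matrix(features):
    """Channel-by-channel correlations of a (C, H, W) feature map.

    The Gram matrix discards spatial layout, which is why it is a common
    (assumed here, not taken from the paper) summary of image "style".
    """
    c, h, w = features.shape
    flat = features.reshape(c, h * w)
    return flat @ flat.T / (c * h * w)


def content_loss(output_feat, content_feat):
    """Mean squared difference between feature maps, position by position."""
    return float(np.mean((output_feat - content_feat) ** 2))


def style_loss(output_feat, style_feat):
    """Squared difference between the Gram matrices of two feature maps."""
    diff = gram_matrix(output_feat) - gram_matrix(style_feat)
    return float(np.sum(diff ** 2))


def total_loss(output_feat, content_feat, style_feat,
               content_weight=1.0, style_weight=10.0):
    """Weighted sum of both terms; the weights are hypothetical defaults."""
    return (content_weight * content_loss(output_feat, content_feat)
            + style_weight * style_loss(output_feat, style_feat))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    real_feat = rng.standard_normal((3, 4, 4))       # stands in for a real image
    synthetic_feat = rng.standard_normal((3, 4, 4))  # stands in for a synthetic image
    print(total_loss(real_feat, real_feat, synthetic_feat))
```

In this sketch the content term is sensitive to where things are in the image (which matters for properties such as pupil position), while the style term only compares global channel statistics, so an image can move toward the synthetic style without its spatial content being penalized.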

URL

https://arxiv.org/abs/1903.05820

PDF

https://arxiv.org/pdf/1903.05820.pdf

