Generating objects going well with the surroundings

2018-07-09 03:26:10
Jeesoo Kim, Jaeyoung Yoo, Jangho Kim, Nojun Kwak

Abstract

Since generative adversarial networks made a breakthrough in image generation, many of their applications have been studied, such as image restoration, style transfer, and image completion. However, few studies have addressed generating objects in uncontrolled real-world environments. In this paper, we propose a novel approach for image generation in real-world scenes. The overall architecture consists of two networks: one completes the shape of the object being generated, and the other paints the context on it. Our model forms the shape of an object using a subnetwork proposed in earlier work on image completion. Unlike the approaches used for image completion, details of the trained objects are encoded into a latent variable by an additional subnetwork, which improves the quality of the generated objects. We evaluated our method on the KITTI and Cityscapes datasets, which are widely used for object detection and image segmentation. The adequacy of the images generated by the proposed method was also evaluated using a widely used object detection algorithm.

URL

https://arxiv.org/abs/1807.02925

PDF

https://arxiv.org/pdf/1807.02925.pdf

