Paper Reading AI Learner

Perceptual Adversarial Networks for Image-to-Image Transformation

2019-04-03 02:44:53
Chaoyue Wang, Chang Xu, Chaohui Wang, Dacheng Tao

Abstract

In this paper, we propose principled Perceptual Adversarial Networks (PAN) for image-to-image transformation tasks. Unlike existing application-specific algorithms, PAN provides a generic framework for learning the mapping between paired images (Fig. 1), such as mapping a rainy image to its de-rained counterpart, object edges to the corresponding photo, or semantic labels to a scene image. The proposed PAN consists of two feed-forward convolutional neural networks (CNNs): the image transformation network T and the discriminative network D. By combining the generative adversarial loss with the proposed perceptual adversarial loss, these two networks can be trained alternately to solve image-to-image transformation tasks. Specifically, the hidden layers and the output of the discriminative network D are upgraded to continually and automatically discover discrepancies between the transformed image and the corresponding ground truth, while the image transformation network T is trained to minimize the discrepancies discovered by D. Through this adversarial training process, the image transformation network T continually narrows the gap between transformed images and ground-truth images. Experiments on several image-to-image transformation tasks (e.g., image de-raining and image inpainting) show that the proposed PAN outperforms many related state-of-the-art methods.
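The perceptual adversarial loss described above measures the discrepancy between a transformed image and its ground truth inside the discriminator's hidden layers. A minimal NumPy sketch of that idea is below; the toy two-layer "discriminator", its layer sizes, fixed random weights, and the L1 distance per layer are illustrative assumptions, not the paper's actual architecture or loss weights.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy "discriminator" D: two fixed random fully-connected
# ReLU layers standing in for D's hidden convolutional layers.
W1 = rng.standard_normal((16, 8))
W2 = rng.standard_normal((8, 4))

def hidden_features(x):
    """Return the hidden-layer activations of the toy discriminator D."""
    h1 = np.maximum(x @ W1, 0.0)   # hidden layer 1 (ReLU)
    h2 = np.maximum(h1 @ W2, 0.0)  # hidden layer 2 (ReLU)
    return [h1, h2]

def perceptual_adversarial_loss(transformed, ground_truth, weights=(1.0, 1.0)):
    """Weighted sum of per-layer L1 distances between D's hidden features
    of the transformed image and of the ground truth."""
    f_t = hidden_features(transformed)
    f_g = hidden_features(ground_truth)
    return sum(w * np.abs(a - b).mean()
               for w, a, b in zip(weights, f_t, f_g))

# Usage: T would be trained to minimize this loss, while D's layers are
# updated adversarially so the discrepancy stays informative.
x = rng.standard_normal((1, 16))
loss_same = perceptual_adversarial_loss(x, x)       # identical inputs -> 0.0
loss_diff = perceptual_adversarial_loss(x, x + 10.0)
```

In the paper's setting the same discrepancy is minimized by T and pushed back up by updating D, which is what makes the loss "adversarial" rather than a fixed perceptual loss computed on a frozen network.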

URL

https://arxiv.org/abs/1706.09138

PDF

https://arxiv.org/pdf/1706.09138.pdf

