CoopInit: Initializing Generative Adversarial Networks via Cooperative Learning

2023-03-21 07:49:32
Yang Zhao, Jianwen Xie, Ping Li

Abstract

Numerous research efforts have been made to stabilize the training of Generative Adversarial Networks (GANs), such as through regularization and architecture design. However, we identify that instability can also arise from the fragile balance at the early stage of adversarial learning. This paper proposes CoopInit, a simple yet effective cooperative learning-based initialization strategy that can quickly learn a good starting point for GANs, with very small computational overhead during training. The proposed algorithm consists of two learning stages: (i) Cooperative initialization stage: the discriminator of the GAN is treated as an energy-based model (EBM) and is optimized via maximum likelihood estimation (MLE), with the GAN's generator providing synthetic data to approximate the learning gradients. The EBM also guides the MLE learning of the generator via MCMC teaching; (ii) Adversarial finalization stage: after a few iterations of initialization, the algorithm seamlessly transitions to regular minimax adversarial training until convergence. The motivation is that the MLE-based initialization stage drives the model towards mode coverage, which helps alleviate mode dropping during the adversarial learning stage. We demonstrate the effectiveness of the proposed approach on image generation and one-sided unpaired image-to-image translation tasks through extensive experiments.
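The two-stage schedule described in the abstract can be illustrated with a minimal one-dimensional sketch. This is not the authors' implementation. It assumes a toy quadratic score f(x) = -0.5 * w * (x - c)^2 standing in for the discriminator viewed as an EBM, a location-scale generator x = mu + s * z, and hand-derived gradients. All names (`langevin`, `coop_step`, `train`, `adv_step`, the `ebm` and `gen` dictionaries) and all hyperparameters are hypothetical choices made for illustration. The regular adversarial update is left as a user-supplied callback, since the abstract describes it only as standard minimax training.

```python
import random


def mean(values):
    return sum(values) / len(values)


def langevin(xs, c, w, rng, step=0.1, n_steps=5):
    """Revise samples with a few Langevin steps under p(x) proportional to exp(f(x)),
    where f(x) = -0.5 * w * (x - c)**2 is the assumed toy EBM score."""
    out = list(xs)
    for _ in range(n_steps):
        # grad_x f(x) = -w * (x - c), plus Gaussian noise scaled by the step size
        out = [x + 0.5 * step ** 2 * (-w * (x - c)) + step * rng.gauss(0.0, 1.0)
               for x in out]
    return out


def coop_step(data, ebm, gen, rng, batch=64, lr=0.05):
    """One cooperative-initialization update (toy version, assumed form)."""
    c, w = ebm["c"], ebm["w"]
    mu, s = gen["mu"], gen["s"]
    real = [rng.choice(data) for _ in range(batch)]
    zs = [rng.gauss(0.0, 1.0) for _ in range(batch)]
    proposals = [mu + s * z for z in zs]          # generator provides synthetic data
    revised = langevin(proposals, c, w, rng)      # EBM refines it via MCMC

    # MLE gradient for the EBM: E_data[df/dtheta] - E_synthetic[df/dtheta]
    g_c = w * (mean(real) - mean(revised))
    g_w = -0.5 * (mean([(x - c) ** 2 for x in real])
                  - mean([(x - c) ** 2 for x in revised]))
    ebm["c"] = c + lr * g_c
    ebm["w"] = min(10.0, max(0.01, w + lr * g_w))  # clamp keeps the toy energy valid

    # MCMC teaching: regress the generator onto the revised samples (same z)
    resid = [xr - (mu + s * z) for xr, z in zip(revised, zs)]
    gen["mu"] = mu + lr * 2.0 * mean(resid)
    gen["s"] = s + lr * 2.0 * mean([r * z for r, z in zip(resid, zs)])


def train(data, num_iters, init_iters, adv_step, seed=0):
    """Run init_iters cooperative steps, then hand over to adversarial steps."""
    rng = random.Random(seed)
    ebm = {"c": 0.0, "w": 1.0}
    gen = {"mu": 0.0, "s": 1.0}
    stages = []
    for it in range(num_iters):
        if it < init_iters:
            coop_step(data, ebm, gen, rng)
            stages.append("coop")
        else:
            adv_step(ebm, gen, rng)   # hypothetical callback: regular GAN update
            stages.append("adv")
    return ebm, gen, stages
```

A call such as `train(data, num_iters=1000, init_iters=50, adv_step=my_gan_step)` would run 50 cooperative steps before switching, where `my_gan_step` is a placeholder for whatever adversarial update is in use. The point of the sketch is the handover: both stages update the same two sets of parameters, so the adversarial stage starts from whatever the cooperative stage has learned.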

URL

https://arxiv.org/abs/2303.11649

PDF

https://arxiv.org/pdf/2303.11649.pdf

