Paper Reading AI Learner

mu-Forcing: Training Variational Recurrent Autoencoders for Text Generation

2019-05-24 07:32:37
Dayiheng Liu, Xu Yang, Feng He, Yuanyuan Chen, Jiancheng Lv

Abstract

It has been previously observed that training Variational Recurrent Autoencoders (VRAE) for text generation suffers from a serious uninformative-latent-variable problem: the model collapses into a plain language model that completely ignores the latent variables and can only generate repetitive and dull samples. In this paper, we explore the reason behind this issue and propose an effective regularizer-based approach to address it. The proposed method directly injects extra constraints on the posteriors of the latent variables into the learning process of VRAE, which can flexibly and stably control the trade-off between the KL term and the reconstruction term, making the model learn dense and meaningful latent representations. The experimental results show that the proposed method outperforms several strong baselines, enables the model to learn interpretable latent variables, and generates diverse, meaningful sentences. Furthermore, the proposed method performs well without resorting to other strategies such as KL annealing.
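The abstract describes adding a constraint on the posteriors of the latent variables to the usual VRAE objective (reconstruction term plus KL term). Below is a minimal numpy sketch of how such a posterior constraint might look: a hinge-style penalty that fires when the mean squared L2-norm of the posterior means drops below a floor. The exact functional form, and the names `mu0` and `lam`, are assumptions for illustration, not the paper's definitive formulation.

```python
import numpy as np

def kl_diag_gaussian(mu, logvar):
    # KL( N(mu, diag(exp(logvar))) || N(0, I) ), summed over latent
    # dimensions and averaged over the batch.
    return np.mean(0.5 * np.sum(np.exp(logvar) + mu**2 - 1.0 - logvar, axis=1))

def mu_forcing_loss(recon_loss, mu, logvar, mu0=1.0, lam=1.0):
    # Hinge-style posterior constraint (an illustrative assumption): penalize
    # the encoder only when the batch-mean squared norm of the posterior
    # means falls below the floor mu0, pushing the latent code to stay
    # informative instead of collapsing to the prior.
    mean_sq_norm = np.mean(np.sum(mu**2, axis=1))
    reg = max(0.0, mu0 - mean_sq_norm)
    return recon_loss + kl_diag_gaussian(mu, logvar) + lam * reg
```

If the encoder collapses to the prior (`mu == 0`, `logvar == 0`), the KL term is zero but the regularizer contributes `lam * mu0`, so the collapsed solution is no longer a free minimum; `lam` then controls how strongly the trade-off is tilted toward informative posteriors.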

Abstract (translated)

Previous work has observed that training Variational Recurrent Autoencoders (VRAE) for text generation suffers from a serious uninformative-latent-variable problem: the model collapses into a plain language model that completely ignores the latent variables and can only generate repetitive and dull samples. In this paper, we explore the reason behind this issue and propose an effective regularizer-based method to address it. The method directly injects extra constraints on the posteriors of the latent variables into the learning process of VRAE, which can flexibly and stably control the trade-off between the KL term and the reconstruction term, making the model learn dense and meaningful latent representations. Experimental results show that the method outperforms several strong baselines, enables the model to learn interpretable latent variables, and generates diverse, meaningful sentences. Moreover, the method performs well without other strategies such as KL annealing.

URL

https://arxiv.org/abs/1905.10072

PDF

https://arxiv.org/pdf/1905.10072.pdf
