Paper Reading AI Learner

Copy mechanism and tailored training for character-based data-to-text generation

2019-04-26 13:33:56
Marco Roberti, Giovanni Bonetta, Rossella Cancelliere, Patrick Gallinari

Abstract

In the last few years, many different methods have been focusing on using deep recurrent neural networks for natural language generation. The most widely used sequence-to-sequence neural methods are word-based: as such, they need a pre-processing step called delexicalization (and a corresponding post-processing step, relexicalization) to deal with uncommon or unknown words. These forms of processing, however, give rise to models that depend on the vocabulary used and are not completely neural. In this work, we present an end-to-end sequence-to-sequence model with attention mechanism which reads and generates at a character level, no longer requiring delexicalization, tokenization, or even lowercasing. Moreover, since characters constitute the common "building blocks" of every text, it also allows a more general approach to text generation, enabling the possibility to exploit transfer learning for training. These skills are obtained thanks to two major features: (i) the possibility to alternate between the standard generation mechanism and a copy one, which allows the model to directly copy input facts into the output, and (ii) the use of an original training pipeline that further improves the quality of the generated texts. We also introduce a new dataset called E2E+, designed to highlight the copying capabilities of character-based models, which is a modified version of the well-known E2E dataset used in the E2E Challenge. We tested our model according to five broadly accepted metrics (including the widely used BLEU), showing that it yields competitive performance with respect to both character-based and word-based approaches.
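The alternation between generation and copying described in point (i) is commonly realized in the literature as a pointer-generator-style mixture: the decoder's vocabulary distribution is blended with a copy distribution obtained by scattering the attention weights onto the input characters. The abstract does not give the authors' exact formulation, so the sketch below is a generic, minimal illustration of that idea; all function and parameter names (`copy_mechanism_step`, `p_gen`, etc.) are assumptions, not the paper's API.

```python
import numpy as np

def copy_mechanism_step(vocab_logits, attention, input_ids, p_gen, vocab_size):
    """One decoding step of a generic copy/generate mixture (hypothetical
    sketch, not the paper's exact model).

    vocab_logits: raw generator scores over the character vocabulary
    attention:    attention weights over input positions (sums to 1)
    input_ids:    vocabulary index of each input character
    p_gen:        scalar in [0, 1]; probability of generating vs. copying
    """
    # Numerically stable softmax over the generator's logits.
    exp = np.exp(vocab_logits - vocab_logits.max())
    p_vocab = exp / exp.sum()

    # Copy distribution: scatter attention mass onto the characters that
    # actually appear in the input (np.add.at accumulates repeated ids).
    p_copy = np.zeros(vocab_size)
    np.add.at(p_copy, input_ids, attention)

    # Final mixture: generate with probability p_gen, copy otherwise.
    return p_gen * p_vocab + (1.0 - p_gen) * p_copy
```

With a low `p_gen` and attention concentrated on one input character, the mixture places most of its mass on that character, which is how such models reproduce rare input facts (names, numbers) verbatim instead of hallucinating them from the vocabulary distribution.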


URL

https://arxiv.org/abs/1904.11838

PDF

https://arxiv.org/pdf/1904.11838.pdf

