Paper Reading AI Learner

Top-down Tree Structured Decoding with Syntactic Connections for Neural Machine Translation and Parsing

2018-09-06 07:33:48
Jetic Gū, Hassan S. Shavarani, Anoop Sarkar

Abstract

The addition of syntax-aware decoding in Neural Machine Translation (NMT) systems requires an effective tree-structured neural network, a syntax-aware attention model, and a language generation model that is sensitive to sentence structure. We exploit a top-down tree-structured model called DRNN (Doubly-Recurrent Neural Networks), first proposed by Alvarez-Melis and Jaakkola (2017), to create an NMT model called Seq2DRNN that combines a sequential encoder with tree-structured decoding augmented with a syntax-aware attention model. Unlike previous approaches to syntax-based NMT, which use dependency parsing models, our method uses constituency parsing, which we argue provides useful information for translation. In addition, we use the syntactic structure of the sentence to add new connections to the tree-structured decoder neural network (Seq2DRNN+SynC). We compare our NMT model with sequential and state-of-the-art syntax-based NMT models and show that our model produces more fluent translations with better reordering. Since our model is capable of doing translation and constituency parsing at the same time, we also compare our parsing accuracy against other neural parsing models.
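The DRNN decoder described in the abstract maintains two recurrences per tree node: an ancestral state passed from parent to child and a fraternal state passed from a node to its next sibling, which are combined into a predictive state that scores the next output token. A minimal NumPy sketch of that node update, decoding a tiny one-level tree top-down and left-to-right (toy dimensions, random weights, and plain tanh cells instead of the paper's LSTMs; all names here are illustrative, not the authors' code):

```python
import numpy as np

rng = np.random.default_rng(0)
D = 8          # hidden size (illustrative)
V = 5          # toy vocabulary size

# Separate cells for the two recurrences (the paper uses LSTMs;
# plain tanh cells keep the sketch short).
Wa, Ua = rng.normal(scale=0.1, size=(D, D)), rng.normal(scale=0.1, size=(D, D))
Wf, Uf = rng.normal(scale=0.1, size=(D, D)), rng.normal(scale=0.1, size=(D, D))
Ga, Gf = rng.normal(scale=0.1, size=(D, D)), rng.normal(scale=0.1, size=(D, D))
Wout   = rng.normal(scale=0.1, size=(V, D))
emb    = rng.normal(scale=0.1, size=(V, D))   # toy word embeddings

def ancestral_step(h_parent, x):
    """Recurrence along the parent -> child axis."""
    return np.tanh(Wa @ x + Ua @ h_parent)

def fraternal_step(h_sibling, x):
    """Recurrence along the previous-sibling -> next-sibling axis."""
    return np.tanh(Wf @ x + Uf @ h_sibling)

def predict(h_anc, h_fra):
    """Combine both states into a predictive state, then score the vocab."""
    h_pred = np.tanh(Ga @ h_anc + Gf @ h_fra)
    logits = Wout @ h_pred
    return logits - logits.max()   # stabilised scores (softmax omitted)

# Decode a root with two children, in the order a top-down decoder uses.
h0 = np.zeros(D)                       # an encoder summary would go here
h_root = ancestral_step(h0, emb[0])
root_token = int(np.argmax(predict(h_root, np.zeros(D))))

h_fra = np.zeros(D)                    # the first child has no left sibling
children = []
for _ in range(2):
    h_anc = ancestral_step(h_root, emb[root_token])
    tok = int(np.argmax(predict(h_anc, h_fra)))
    children.append(tok)
    h_fra = fraternal_step(h_fra, emb[tok])  # hand state to the next sibling

print(root_token, children)
```

In the actual Seq2DRNN model, the initial state comes from the sequential encoder with syntax-aware attention, and separate gates decide when to stop generating children in depth and in width; this sketch only shows how the two recurrences drive top-down, left-to-right generation.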

URL

https://arxiv.org/abs/1809.01854

PDF

https://arxiv.org/pdf/1809.01854.pdf

