Evaluation of NMT-Assisted Grammar Transfer for a Multi-Language Configurable Data-to-Text System

2025-01-27 15:25:26
Andreas Madsack, Johanna Heininger, Adela Schneider, Ching-Yi Chen, Christian Eckard, Robert Weißgraeber

Abstract

One approach for multilingual data-to-text generation is to translate grammatical configurations upfront from the source language into each target language. These configurations are then used by a surface realizer and in document planning stages to generate output. In this paper, we describe a rule-based NLG implementation of this approach where the configuration is translated by Neural Machine Translation (NMT) combined with a one-time human review, and introduce a cross-language grammar dependency model to create a multilingual NLG system that generates text from the source data, scaling the generation phase without a human in the loop. Additionally, we introduce a method for human post-editing evaluation on the automatically translated text. Our evaluation on the SportSett:Basketball dataset shows that our NLG system performs well, underlining its grammatical correctness in translation tasks.

URL

https://arxiv.org/abs/2501.16135

PDF

https://arxiv.org/pdf/2501.16135.pdf

