
A Comparative Study of LLMs, NMT Models, and Their Combination in Persian-English Idiom Translation

2024-12-13 09:29:27
Sara Rezaeimanesh, Faezeh Hosseini, Yadollah Yaghoobzadeh

Abstract

Large language models (LLMs) have shown superior capabilities in translating figurative language compared to neural machine translation (NMT) systems. However, the impact of different prompting methods and LLM-NMT combinations on idiom translation has yet to be thoroughly investigated. This paper introduces two parallel datasets of sentences containing idiomatic expressions for Persian$\rightarrow$English and English$\rightarrow$Persian translations, with Persian idioms sampled from our PersianIdioms resource, a collection of 2,200 idioms and their meanings. Using these datasets, we evaluate various open- and closed-source LLMs, NMT models, and their combinations. Translation quality is assessed through idiom translation accuracy and fluency. We also find that automatic evaluation methods like LLM-as-a-judge, BLEU and BERTScore are effective for comparing different aspects of model performance. Our experiments reveal that Claude-3.5-Sonnet delivers outstanding results in both translation directions. For English$\rightarrow$Persian, combining weaker LLMs with Google Translate improves results, while Persian$\rightarrow$English translations benefit from single prompts for simpler models and complex prompts for advanced ones.
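Of the automatic metrics named in the abstract, BLEU is the simplest to illustrate. The following is a minimal sketch of a sentence-level, single-reference BLEU-style score; it is not the evaluation code used in the paper. The function names, whitespace tokenization, add-one smoothing, and example sentences are all assumptions made for illustration.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(candidate, reference, max_n=4):
    """Simplified single-reference BLEU with add-one smoothing.

    Illustrative assumptions: whitespace tokenization, no case or
    punctuation normalization, uniform weights over n-gram orders.
    """
    cand, ref = candidate.split(), reference.split()
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts, ref_counts = ngrams(cand, n), ngrams(ref, n)
        # Clipped matches: a candidate n-gram counts at most as often
        # as it occurs in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = sum(cand_counts.values())
        # Add-one smoothing keeps the logarithm defined with zero overlap.
        log_precisions.append(math.log((overlap + 1) / (total + 1)))
    # Brevity penalty discourages candidates shorter than the reference.
    bp = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * math.exp(sum(log_precisions) / max_n)


# Hypothetical example: a reference translation and two candidates.
reference = "he died last night"
idiomatic = "he died last night"
literal = "he kicked the bucket last night"

print(sentence_bleu(idiomatic, reference))
print(sentence_bleu(literal, reference))
```

The score reflects only n-gram overlap with the reference, so in practice an established implementation such as sacreBLEU would normally be preferred over a hand-rolled one.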


URL

https://arxiv.org/abs/2412.09993

PDF

https://arxiv.org/pdf/2412.09993.pdf

