3AM: An Ambiguity-Aware Multi-Modal Machine Translation Dataset

2024-04-29 04:01:30
Xinyu Ma, Xuebo Liu, Derek F. Wong, Jun Rao, Bei Li, Liang Ding, Lidia S. Chao, Dacheng Tao, Min Zhang

Abstract

Multimodal machine translation (MMT) is a challenging task that seeks to improve translation quality by incorporating visual information. However, recent studies have indicated that the visual information provided by existing MMT datasets is insufficient, causing models to disregard it and overestimate their capabilities. This issue presents a significant obstacle to the development of MMT research. This paper presents a novel solution to this issue by introducing 3AM, an ambiguity-aware MMT dataset comprising 26,000 parallel sentence pairs in English and Chinese, each with corresponding images. Our dataset is specifically designed to include more ambiguity and a greater variety of both captions and images than other MMT datasets. We utilize a word sense disambiguation model to select ambiguous data from vision-and-language datasets, resulting in a more challenging dataset. We further benchmark several state-of-the-art MMT models on our proposed dataset. Experimental results show that MMT models trained on our dataset exhibit a greater ability to exploit visual information than those trained on other MMT datasets. Our work provides a valuable resource for researchers in the field of multimodal learning and encourages further exploration in this area. The data, code and scripts are freely available at this https URL.
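As a concrete illustration of the selection step described in the abstract, the sketch below filters caption–translation pairs by lexical ambiguity. This is only a minimal sketch under stated assumptions, not the authors' pipeline: it substitutes WordNet polysemy counts (via NLTK) for the paper's word sense disambiguation model, and the `ambiguity_score` helper, `min_score` threshold, and data layout are hypothetical.

```python
# Minimal sketch of ambiguity-aware data selection.
# Assumption: WordNet polysemy counts stand in for the paper's word sense
# disambiguation model; the threshold and data layout are illustrative,
# not taken from the 3AM paper.
import nltk
from nltk.corpus import wordnet as wn

for resource in ("wordnet", "punkt", "averaged_perceptron_tagger"):
    nltk.download(resource, quiet=True)

# Map Penn Treebank tag prefixes to WordNet parts of speech.
POS_MAP = {"NN": wn.NOUN, "VB": wn.VERB, "JJ": wn.ADJ, "RB": wn.ADV}

def ambiguity_score(caption: str) -> int:
    """Count content words that have more than one WordNet sense."""
    tagged = nltk.pos_tag(nltk.word_tokenize(caption.lower()))
    score = 0
    for word, tag in tagged:
        wn_pos = POS_MAP.get(tag[:2])
        if wn_pos and len(wn.synsets(word, pos=wn_pos)) > 1:
            score += 1
    return score

def select_ambiguous(pairs, min_score=2):
    """Keep (English caption, Chinese translation, image path) triples
    whose English side contains at least min_score ambiguous words."""
    return [p for p in pairs if ambiguity_score(p[0]) >= min_score]

# Toy usage: 'bat', 'bank', and 'pitcher' are all polysemous in WordNet;
# this prints the triples whose captions clear the threshold.
corpus = [("A bat rests on the bank near the pitcher.", "...", "img1.jpg"),
          ("Sunset over the ocean.", "...", "img2.jpg")]
print(select_ambiguous(corpus))
```

A real pipeline would score senses in context rather than counting dictionary entries, but the filtering logic, keeping only pairs whose source side is lexically ambiguous, follows the idea the abstract describes.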

URL

https://arxiv.org/abs/2404.18413

PDF

https://arxiv.org/pdf/2404.18413.pdf
