Abstract
BERTScore is an effective and robust automatic metric for reference-based machine translation evaluation. In this paper, we incorporate a multilingual knowledge graph into BERTScore and propose a metric named KG-BERTScore, which linearly combines the results of BERTScore and bilingual named entity matching for reference-free machine translation evaluation. In experiments on the WMT19 "QE as a metric" (without references) shared tasks, KG-BERTScore achieves a higher overall correlation with human judgements than the current state-of-the-art metrics for reference-free machine translation evaluation. We also study the pre-trained multilingual model used by KG-BERTScore and the parameter for the linear combination.
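The linear combination mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formula, so the convex form below (weights summing to 1), the function name `combine_scores`, and the parameter name `alpha` are assumptions made for illustration, not the paper's definition. Both input scores are assumed to be precomputed and to lie in [0, 1].

```python
def combine_scores(bert_score: float, entity_match_score: float, alpha: float = 0.5) -> float:
    """Linearly combine a BERTScore value with a bilingual entity matching score.

    Assumption: a convex combination weighted by `alpha`. The abstract only
    states that the two results are combined linearly, so this exact form
    and the default weight are hypothetical.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    # alpha weights the entity matching score; the remainder weights BERTScore.
    return alpha * entity_match_score + (1.0 - alpha) * bert_score


if __name__ == "__main__":
    # Hypothetical scores for one source/translation pair.
    print(combine_scores(bert_score=0.8, entity_match_score=0.6, alpha=0.5))
```

With `alpha = 0` the sketch reduces to plain BERTScore, and with `alpha = 1` it uses only the entity matching score, which is one way to read the abstract's remark that the combination parameter is studied.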
URL
https://arxiv.org/abs/2301.12699