Abstract
Machine translation systems are conventionally trained on textual resources that do not model phenomena that occur in spoken language. While the evaluation of neural machine translation systems on textual inputs is actively researched in the literature, little is known about the complexities of translating spoken language data with neural models. We introduce and motivate problems that arise when translating automatic speech recognition (ASR) outputs with neural machine translation (NMT) systems. We test the robustness of sentence encoding approaches for NMT encoder-decoder modeling, focusing on word-based over byte-pair encoding. We compare the translation of utterances containing ASR errors by state-of-the-art NMT encoder-decoder systems against a strong phrase-based machine translation baseline. The goal is to better understand which phenomena present in ASR outputs are better represented under the NMT framework than under approaches that represent translation as a linear model.
URL
https://arxiv.org/abs/1904.10997