Generalizing Procrustes Analysis for Better Bilingual Dictionary Induction

2018-08-31 21:20:00

Yova Kementchedjhieva, Sebastian Ruder, Ryan Cotterell, Anders Søgaard

arXiv_CL

Abstract

Most recent approaches to bilingual dictionary induction find a linear alignment between the word vector spaces of two languages. We show that projecting the two languages onto a third, latent space, rather than directly onto each other, is equivalent in expressivity but makes approximate alignments easier to learn. Our modified approach also allows supporting languages to be included in the alignment process, which further improves performance in low-resource settings.
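
The abstract does not spell out the procedure, so the following is only a minimal sketch of the two ideas it contrasts, not the authors' implementation. It assumes the maps are orthogonal, that the rows of each matrix are embeddings of seed-dictionary translation pairs in the same order, and that the latent space is taken as the mean of the mapped spaces. The function names `procrustes` and `generalized_procrustes` are hypothetical.

```python
import numpy as np


def procrustes(X, Y):
    """Return the orthogonal W minimizing ||X @ W - Y||_F.

    X, Y: (n_pairs, dim) arrays whose i-th rows are embeddings of a
    source word and its translation (assumed seed dictionary).
    """
    # Closed-form solution: SVD of the cross-covariance matrix.
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt


def generalized_procrustes(spaces, n_iter=20):
    """Map every space onto a shared latent space G (illustrative scheme).

    spaces: list of (n_pairs, dim) arrays, rows aligned across languages.
    Alternates between (1) fitting one orthogonal map per language onto G
    and (2) re-estimating G as the mean of the mapped spaces.
    Extra "supporting" languages are simply extra entries in `spaces`.
    """
    G = spaces[0].copy()  # assumption: initialize G with the first language
    for _ in range(n_iter):
        maps = [procrustes(X, G) for X in spaces]
        G = np.mean([X @ T for X, T in zip(spaces, maps)], axis=0)
    # Refit once more so the returned maps correspond to the returned G.
    maps = [procrustes(X, G) for X in spaces]
    return maps, G


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    src = rng.normal(size=(200, 10))
    rotation, _ = np.linalg.qr(rng.normal(size=(10, 10)))
    tgt = src @ rotation + 0.01 * rng.normal(size=(200, 10))

    # Direct alignment: source mapped straight onto target.
    W = procrustes(src, tgt)
    print("direct residual:", np.linalg.norm(src @ W - tgt))

    # Latent-space alignment: both languages mapped onto G.
    (T_src, T_tgt), G = generalized_procrustes([src, tgt])
    print("latent residual:", np.linalg.norm(src @ T_src - tgt @ T_tgt))
```

In this sketch a word translation would be retrieved by nearest-neighbor search between `src @ T_src` and `tgt @ T_tgt`, rather than between `src @ W` and `tgt`.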

URL

https://arxiv.org/abs/1809.00064

PDF

https://arxiv.org/pdf/1809.00064.pdf