Discrete representations in neural models of spoken language

2021-05-12 11:02:02

Bertrand Higy, Lieke Gelderloos, Afra Alishahi, Grzegorz Chrupała

arXiv_CL

Abstract

The distributed and continuous representations used by neural networks are at odds with representations employed in linguistics, which are typically symbolic. Vector quantization has been proposed as a way to induce discrete neural representations that are closer in nature to their linguistic counterparts. However, it is not clear which metrics are best suited to analyzing such discrete representations. We compare the merits of four commonly used metrics in the context of weakly supervised models of spoken language. We perform a systematic analysis of the impact of (i) architectural choices, (ii) the learning objective and training dataset, and (iii) the evaluation metric. We find that the different evaluation metrics can give inconsistent results. In particular, we find that the use of minimal pairs of phoneme triples as stimuli during evaluation disadvantages larger embeddings, unlike metrics applied to complete utterances.

URL

https://arxiv.org/abs/2105.05582

PDF

https://arxiv.org/pdf/2105.05582.pdf