A study of the robustness of raw waveform based speaker embeddings under mismatched conditions

2021-10-08 17:21:21

Ge Zhu, Frank Cwitkowitz, Zhiyao Duan

Abstract

In this paper, we conduct a cross-dataset study of parametric and non-parametric raw-waveform-based speaker embeddings through speaker verification experiments. In general, we observe that these raw-waveform systems degrade more severely than spectral-based systems. We then propose two strategies to improve the performance of raw-waveform-based systems on cross-dataset tests. The first replaces the real-valued filters with analytic filters to ensure shift-invariance. The second applies variational dropout to the non-parametric filters to prevent them from overfitting to irrelevant nuance features.
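
To illustrate why analytic filters help with shift-invariance, here is a minimal NumPy sketch. It is not the authors' implementation: the Gabor-shaped filter, the function names `gabor_filters` and `response_magnitude_spread`, and all parameter values (16 kHz sample rate, 1 kHz center frequency, filter length and width) are assumptions chosen for illustration only. The sketch compares how much the response magnitude of a real-valued filter and of its analytic (complex) counterpart changes when the input tone is shifted by a few samples.

```python
import numpy as np

def gabor_filters(center_freq, sample_rate=16000, length=401, width=0.004):
    """Return a real-valued Gabor filter and its analytic counterpart.

    All parameter values are illustrative assumptions, not taken from the paper.
    """
    # Time axis in seconds, centered on the middle tap of the filter.
    t = (np.arange(length) - length // 2) / sample_rate
    # Gaussian envelope shared by both filters.
    window = np.exp(-0.5 * (t / width) ** 2)
    real_filter = window * np.cos(2 * np.pi * center_freq * t)
    # Analytic version: a complex exponential carrier instead of a cosine.
    analytic_filter = window * np.exp(2j * np.pi * center_freq * t)
    return real_filter, analytic_filter

def response_magnitude_spread(filt, center_freq, sample_rate=16000, shifts=range(16)):
    """Relative spread of |<x_shifted, filt>| over a set of sample shifts."""
    n = np.arange(len(filt))
    mags = []
    for s in shifts:
        # A pure tone at the filter's center frequency, shifted by s samples.
        x = np.cos(2 * np.pi * center_freq * (n + s) / sample_rate)
        mags.append(abs(np.dot(x, filt)))
    mags = np.array(mags)
    return (mags.max() - mags.min()) / mags.max()

real_f, analytic_f = gabor_filters(center_freq=1000.0)
print("real filter spread:    ", response_magnitude_spread(real_f, 1000.0))
print("analytic filter spread:", response_magnitude_spread(analytic_f, 1000.0))
```

In this toy setting, the magnitude of the real filter's response depends strongly on the phase of the input, while the modulus of the analytic filter's response stays nearly constant across shifts. This is the property the first strategy relies on.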

URL

https://arxiv.org/abs/2110.04265

PDF

https://arxiv.org/pdf/2110.04265.pdf