Beijing ZKJ-NPU Speaker Verification System for VoxCeleb Speaker Recognition Challenge 2021

2021-09-08 12:04:18

Li Zhang, Huan Zhao, Qinling Meng, Yanli Chen, Min Liu, Lei Xie

arXiv_SD Recognition Embedding

Abstract

In this report, we describe the Beijing ZKJ-NPU team's submission to the VoxCeleb Speaker Recognition Challenge 2021 (VoxSRC-21). We participated in the fully supervised speaker verification tracks, Track 1 and Track 2. In the challenge, we explored a range of advanced neural network structures with different pooling layers and objective loss functions. In addition, we introduced the ResNet-DTCF, CoAtNet and PyConv networks to improve the performance of CNN-based speaker embedding models. Moreover, we applied embedding normalization and score normalization at the evaluation stage. By fusing 11 systems for Track 1 and 14 systems for Track 2, our best final results (minDCF/EER) on the evaluation trials are 0.1205/2.8160% and 0.1175/2.8400%, respectively. With our submission, we placed second in the challenge on both tracks.
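
The abstract mentions embedding normalization and score normalization at the evaluation stage but does not say which variants were used. As a rough illustration only, the sketch below assumes L2 length normalization of embeddings, cosine scoring, and symmetric score normalization (S-norm) against a cohort of impostor embeddings. The function names, the cohort, and the choice of S-norm are assumptions made for this example, not details taken from the report.

```python
import numpy as np

def length_normalize(x):
    """Scale each row vector to unit L2 norm (assumed form of embedding normalization)."""
    x = np.asarray(x, dtype=float)
    norms = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.clip(norms, 1e-12, None)

def cosine_scores(a, b):
    """Cosine similarity between every row of a and every row of b."""
    return length_normalize(a) @ length_normalize(b).T

def s_norm(enroll, test, cohort):
    """Symmetric score normalization for a single trial (hypothetical helper).

    enroll, test: 1-D embeddings; cohort: 2-D array of impostor embeddings.
    """
    raw = cosine_scores(enroll[None, :], test[None, :])[0, 0]
    # Score each side of the trial against the cohort.
    e_cohort = cosine_scores(enroll[None, :], cohort)[0]
    t_cohort = cosine_scores(test[None, :], cohort)[0]
    # Normalize the raw score by each side's cohort statistics, then average.
    z = (raw - e_cohort.mean()) / (e_cohort.std() + 1e-12)
    t = (raw - t_cohort.mean()) / (t_cohort.std() + 1e-12)
    return 0.5 * (z + t)
```

In this sketch, a trial score is shifted and scaled by how the enrollment and test embeddings each score against the cohort, which tends to make scores from different speakers and conditions more comparable before a single decision threshold is applied. Practical systems often use only the top-scoring cohort entries (adaptive S-norm); that refinement is omitted here.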

URL

https://arxiv.org/abs/2109.03568

PDF

https://arxiv.org/pdf/2109.03568.pdf