Graph Convolutional Network Based Semi-Supervised Learning on Multi-Speaker Meeting Data

2022-04-25 08:30:26

Fuchuan Tong, Siqi Zheng, Min Zhang, Yafeng Chen, Hongbin Suo, Qingyang Hong, Lin Li

arXiv_SD

Abstract

Unsupervised speaker clustering is becoming increasingly important for its potential uses in semi-supervised learning. In practice, we are often presented with enormous amounts of unlabeled data from multi-party meetings and discussions. An effective unsupervised clustering approach would allow us to significantly increase the amount of training data without additional annotation cost. Recently, methods based on graph convolutional networks (GCN) have received growing attention for unsupervised clustering, as these methods exploit the connectivity patterns between nodes to improve learning performance. In this work, we present a GCN-based approach for semi-supervised learning. Given a pre-trained embedding extractor, a graph convolutional network is trained on the labeled data and then clusters the unlabeled data, producing "pseudo-labels". We present a self-correcting training mechanism that iteratively runs the cluster-train-correct process on the pseudo-labels. We show that the proposed approach makes effective use of unlabeled data and improves speaker recognition accuracy.
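
To make the phrase "exploit the connectivity patterns between nodes" concrete, the following is a minimal sketch of one generic graph-convolution step over speaker embeddings. It is not the authors' architecture: the k-nearest-neighbour graph built from cosine similarity, the toy 2-dimensional embeddings, the identity weight matrix, and the function names `knn_adjacency` and `gcn_layer` are all assumptions chosen for illustration. The propagation rule is the commonly used normalized form ReLU(D^-1/2 (A + I) D^-1/2 X W).

```python
import math

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def knn_adjacency(embeddings, k):
    """Undirected k-nearest-neighbour graph (assumed construction, no self-loops)."""
    n = len(embeddings)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        others = [j for j in range(n) if j != i]
        others.sort(key=lambda j: cosine(embeddings[i], embeddings[j]), reverse=True)
        for j in others[:k]:
            adj[i][j] = adj[j][i] = 1.0  # symmetrize the edge
    return adj

def matmul(a, b):
    """Plain matrix product for lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def gcn_layer(adj, features, weight):
    """One graph-convolution step: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    n = len(adj)
    # Add self-loops so each node keeps part of its own feature vector.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    a_norm = [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]
    out = matmul(matmul(a_norm, features), weight)
    return [[max(v, 0.0) for v in row] for row in out]

# Toy embeddings for two hypothetical speakers (three segments each).
embeddings = [
    [1.0, 0.1], [0.9, 0.2], [1.0, 0.0],  # speaker A
    [0.1, 1.0], [0.2, 0.9], [0.0, 1.0],  # speaker B
]
adj = knn_adjacency(embeddings, k=2)
weight = [[1.0, 0.0], [0.0, 1.0]]  # identity stands in for a learned weight matrix
hidden = gcn_layer(adj, embeddings, weight)
for row in hidden:
    print([round(v, 3) for v in row])
```

In this toy setup each segment is linked only to the two other segments of the same speaker, so one propagation step replaces every embedding with the mean of its group. That smoothing over graph neighbours is what makes same-speaker segments easier to separate; in a trained network the weight matrix would be learned from the labeled data rather than fixed.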

URL

https://arxiv.org/abs/2204.11501

PDF

https://arxiv.org/pdf/2204.11501.pdf