BC-VAD: A Robust Bone Conduction Voice Activity Detection

2022-12-06 14:14:00

Niccolo' Polvani, Damien Ronssin, Milos Cernak

arXiv_SD

Abstract

Voice Activity Detection (VAD) is a fundamental module in many audio applications. Recent state-of-the-art VAD systems are often based on neural networks, but matching the performance of larger models requires a computational budget that usually exceeds the capabilities of a small battery-operated device. In this work, we rely on the input from a bone conduction microphone (BCM) to design an efficient VAD (BC-VAD) that is robust against residual non-stationary noises originating from the environment or from speakers not wearing the BCM. We first show that a larger VAD system (58k parameters) achieves state-of-the-art results on a publicly available benchmark but fails when running on bone conduction signals. We then compare its variant BC-VAD (5k parameters, trained on BC data) with a baseline specifically designed for a BCM and show that the proposed method achieves better performance under various metrics while meeting the real-time processing requirement of a microcontroller.

URL

https://arxiv.org/abs/2212.02996

PDF

https://arxiv.org/pdf/2212.02996.pdf