Overview and Evaluation of Sound Event Localization and Detection in DCASE 2019

2020-09-06 18:15:14

Archontis Politis, Annamaria Mesaros, Sharath Adavanne, Toni Heittola, Tuomas Virtanen

arXiv_SD

Abstract

Sound event localization and detection is a novel area of research that emerged from the combined interest in analyzing the acoustic scene in terms of the spatial and temporal activity of sounds of interest. This paper presents an overview of the first international evaluation on sound event localization and detection, organized as a task of the DCASE 2019 Challenge. A large-scale realistic dataset of spatialized sound events was generated for the challenge, to be used for training learning-based approaches and for evaluating the submissions on an unlabeled subset. Additionally, a competent baseline was provided to the participants. The overview presents in detail how the systems were evaluated and ranked, and the characteristics of the best-performing systems. Common strategies in terms of input features, model architectures, training approaches, exploitation of prior knowledge, and data augmentation are discussed. Since ranking in the challenge was based on evaluating localization and event classification performance individually, part of the overview focuses on presenting metrics for the joint measurement of the two, together with a re-evaluation of submissions using these new metrics. The analysis reveals submissions with balanced performance on classifying sounds correctly close to their original location, as well as systems that are strong on one or both of the two tasks, but not jointly.

URL

https://arxiv.org/abs/2009.02792

PDF

https://arxiv.org/pdf/2009.02792.pdf