Abstract
This paper introduces the acoustic scene classification task of the DCASE 2018 Challenge and the TUT Urban Acoustic Scenes 2018 dataset provided for the task, and evaluates the performance of a baseline system on the task. As in previous years of the challenge, the task is defined as the classification of short audio samples into one of a set of predefined acoustic scene classes, using a supervised, closed-set classification setup. The newly recorded TUT Urban Acoustic Scenes 2018 dataset consists of ten different acoustic scenes and was recorded in six large European cities; it therefore has higher acoustic variability than the previous datasets used for this task. In addition to high-quality binaural recordings, it also includes data recorded with mobile devices. We also present the baseline system, which consists of a convolutional neural network, and its performance in the subtasks using the recommended cross-validation setup.
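To illustrate what a closed-set decision means in this task, the following is a minimal sketch and not the baseline system described in the paper. It assumes a model has already produced one score per scene class for an audio sample. The label names and the function `classify_closed_set` are hypothetical placeholders, since the abstract states only that there are ten classes.

```python
# Hypothetical placeholder labels: the abstract gives the number of
# scene classes (ten) but not their names.
SCENE_CLASSES = [f"scene_{i}" for i in range(10)]


def classify_closed_set(scores):
    """Map one score per class to exactly one known scene label.

    Closed-set means the output is always one of the predefined
    classes; there is no "unknown" or rejection option.
    """
    if len(scores) != len(SCENE_CLASSES):
        raise ValueError("expected one score per scene class")
    # Pick the index of the highest score (first one wins on ties).
    best_index = max(range(len(scores)), key=lambda i: scores[i])
    return SCENE_CLASSES[best_index]
```

Because the decision is an argmax over a fixed label list, even a sample recorded in an unseen environment is assigned to one of the ten scenes.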
URL
https://arxiv.org/abs/1807.09840