Supervised Chorus Detection for Popular Music Using Convolutional Neural Network and Multi-task Learning

2021-03-26 04:32:08

Ju-Chiang Wang, Jordan B.L. Smith, Jitong Chen, Xuchen Song, Yuxuan Wang

arXiv_AI

Abstract

This paper presents a novel supervised approach to detecting the chorus segments in popular music. Traditional approaches to this task are mostly unsupervised, with pipelines designed to target some quality that is assumed to define "chorusness," which usually means seeking the loudest or most frequently repeated sections. We propose to use a convolutional neural network with a multi-task learning objective, which simultaneously fits two temporal activation curves: one indicating "chorusness" as a function of time, and the other the location of the boundaries. We also propose a post-processing method that jointly takes into account the chorus and boundary predictions to produce binary output. In experiments using three datasets, we compare our system to a set of public implementations of other segmentation and chorus-detection algorithms, and find our approach performs significantly better.
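The abstract does not spell out how the post-processing step combines the two curves, so the following is only a minimal sketch of one plausible reading, not the authors' method. It assumes both curves are per-frame values in [0, 1], picks local maxima of the boundary curve as segment edges, and labels a whole segment as chorus when its mean "chorusness" reaches a threshold. The function names (`pick_boundaries`, `binarize_chorus`) and both thresholds are hypothetical choices made for illustration.

```python
from typing import List


def pick_boundaries(boundary_curve: List[float], threshold: float = 0.5) -> List[int]:
    """Return interior frame indices that are local maxima at or above `threshold`.

    Assumption: simple peak picking; the paper may use a different rule.
    """
    peaks = []
    for i in range(1, len(boundary_curve) - 1):
        value = boundary_curve[i]
        is_peak = value > boundary_curve[i - 1] and value >= boundary_curve[i + 1]
        if value >= threshold and is_peak:
            peaks.append(i)
    return peaks


def binarize_chorus(
    chorus_curve: List[float],
    boundary_curve: List[float],
    boundary_threshold: float = 0.5,
    chorus_threshold: float = 0.5,
) -> List[int]:
    """Label every frame 1 (chorus) or 0 (non-chorus), one decision per segment."""
    if len(chorus_curve) != len(boundary_curve):
        raise ValueError("curves must have the same number of frames")

    n = len(chorus_curve)
    # Segment edges: start of track, every picked boundary, end of track.
    edges = [0] + pick_boundaries(boundary_curve, boundary_threshold) + [n]

    labels = [0] * n
    for start, end in zip(edges[:-1], edges[1:]):
        if end <= start:
            continue  # skip empty segments (e.g. an empty input)
        mean_chorusness = sum(chorus_curve[start:end]) / (end - start)
        if mean_chorusness >= chorus_threshold:
            for i in range(start, end):
                labels[i] = 1
    return labels


# Toy curves (made-up numbers, 10 frames) standing in for network outputs.
chorus = [0.1, 0.2, 0.1, 0.8, 0.9, 0.7, 0.9, 0.2, 0.1, 0.1]
boundary = [0.0, 0.1, 0.2, 0.9, 0.2, 0.1, 0.3, 0.8, 0.1, 0.0]
print(binarize_chorus(chorus, boundary))  # [0, 0, 0, 1, 1, 1, 1, 0, 0, 0]
```

Deciding per segment rather than per frame means the binary output can only change at a predicted boundary, which is one way the two predictions could be "jointly" taken into account.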

URL

https://arxiv.org/abs/2103.14253

PDF

https://arxiv.org/pdf/2103.14253.pdf