Paper Reading AI Learner

Deep Music Analogy Via Latent Representation Disentanglement

2019-06-09 12:22:06
Ruihan Yang, Dingsu Wang, Ziyu Wang, Tianyao Chen, Junyan Jiang, Gus Xia

Abstract

Analogy is a key approach to automated music generation, notable for its ability to generate pieces that are both natural and creative based on only a few examples. In general, an analogy is made by partially transferring the music abstractions, i.e., high-level representations and their relationships, from one piece to another; however, this procedure requires disentangling music representations, which takes little effort for musicians but is non-trivial for computers. Three sub-problems arise: extracting latent representations from the observation, disentangling the representations so that each part has a unique semantic interpretation, and mapping the latent representations back to actual music. An explicitly-constrained conditional variational auto-encoder (EC2-VAE) is proposed as a unified solution to all three sub-problems. In this study, we focus on disentangling the pitch and rhythm representations of 8-beat music clips conditioned on chords. In producing music analogies, this model helps us realize the imaginary situation of "what if" a piece were composed with a different pitch contour, rhythm pattern, chord progression, etc., by borrowing the representations from other pieces. Finally, we validate the proposed disentanglement method using objective measurements and evaluate the analogy examples in a subjective study.
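The analogy-making step described above amounts to swapping one semantic sub-vector of the latent code between two encoded pieces. The following is a minimal sketch of that swap, assuming (hypothetically, for illustration only) that the encoder emits a flat latent vector whose first few dimensions encode pitch and whose remainder encodes rhythm; the split point, dimensions, and function names are not taken from the paper.

```python
# Illustrative sketch of analogy-by-swapping over a disentangled latent.
# The pitch/rhythm split point and vector sizes are assumptions, not the
# actual EC2-VAE architecture.

PITCH_DIMS = 4  # assumed size of the pitch sub-vector (illustrative)


def make_analogy(z_source, z_donor, pitch_dims=PITCH_DIMS):
    """Keep the pitch part of z_source, borrow the rhythm part of z_donor.

    Both arguments are flat latent vectors whose first `pitch_dims`
    entries encode pitch and whose remaining entries encode rhythm.
    """
    return z_source[:pitch_dims] + z_donor[pitch_dims:]


# Toy latent codes standing in for two encoded 8-beat clips.
z_a = [1.0, 2.0, 3.0, 4.0, 0.5, 0.6]  # piece A: pitch part + rhythm part
z_b = [9.0, 8.0, 7.0, 6.0, 0.1, 0.2]  # piece B: pitch part + rhythm part

# "What if piece A were played with piece B's rhythm?"
z_analogy = make_analogy(z_a, z_b)
# z_analogy -> [1.0, 2.0, 3.0, 4.0, 0.1, 0.2]
```

Decoding `z_analogy` (conditioned on a chosen chord progression) would then yield the analogy piece; the swap itself is meaningful only because the disentanglement objective gives each sub-vector a distinct semantic interpretation.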


URL

https://arxiv.org/abs/1906.03626

PDF

https://arxiv.org/pdf/1906.03626.pdf

