Learning a Style-Aware Latent Space for Symbolic Music with Adversarial Autoencoders (cs.SD)

  • January 19, 2020
  • Notes

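We address the challenging open problem of learning an effective latent space for symbolic music data in generative music modeling. Our focus is on adversarial regularization as a flexible and natural way to give variational autoencoders contextual information about musical genre and style. We introduce the first Music Adversarial Autoencoder (MusAE) and show that a Gaussian mixture that takes music metadata into account can serve as an effective prior for the autoencoder's latent space. An empirical analysis on a large-scale benchmark shows that our model reconstructs its input more accurately than state-of-the-art models based on standard variational autoencoders. It can also produce realistic interpolations between two musical sequences, smoothly changing the dynamics of the individual tracks. Experiments show that the model organizes its latent space according to low-level properties of the musical pieces, and that it embeds the high-level genre information injected through the prior distribution into the latent variables, which improves overall performance. This lets us modify the generated pieces in a principled way.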
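To make the idea of a metadata-aware Gaussian mixture prior and of latent interpolation more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the genre names, the latent size, the placement of one mixture component per genre along its own axis, and the function names `make_component_means`, `sample_prior`, and `interpolate` are all assumptions made for illustration. In an adversarial autoencoder, samples like these would be the "real" examples shown to the discriminator, while the encoder's outputs are the "fake" ones.

```python
import random


def make_component_means(genres, dim, radius=3.0):
    """Assign one mixture component per genre (hypothetical layout).

    Each genre's mean sits at distance `radius` along its own axis,
    so the components are well separated in the latent space.
    """
    if dim < len(genres):
        raise ValueError("latent dimension must be at least the number of genres")
    return {
        genre: [radius if i == k else 0.0 for i in range(dim)]
        for k, genre in enumerate(genres)
    }


def sample_prior(genre, means, sigma=1.0, rng=random):
    """Draw one latent vector from the isotropic Gaussian component of `genre`."""
    return [m + rng.gauss(0.0, sigma) for m in means[genre]]


def interpolate(z_a, z_b, steps):
    """Return `steps` latent vectors on the straight line from z_a to z_b."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    if len(z_a) != len(z_b):
        raise ValueError("latent vectors must have the same length")
    path = []
    for i in range(steps):
        t = i / (steps - 1)  # t runs from 0.0 (z_a) to 1.0 (z_b)
        path.append([(1 - t) * a + t * b for a, b in zip(z_a, z_b)])
    return path
```

For example, calling `make_component_means(["jazz", "rock"], 4)` and then `sample_prior("jazz", means)` yields a latent vector near the hypothetical jazz component, and `interpolate(z_a, z_b, 8)` gives eight intermediate codes that a trained decoder could turn into a gradual transition between two sequences.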

Original title: Learning a Latent Space of Style-Aware Symbolic Music Representations by Adversarial Autoencoders

Original abstract: We address the challenging open problem of learning an effective latent space for symbolic music data in generative music modeling. We focus on leveraging adversarial regularization as a flexible and natural mean to imbue variational autoencoders with context information concerning music genre and style. Through the paper, we show how Gaussian mixtures taking into account music metadata information can be used as an effective prior for the autoencoder latent space, introducing the first Music Adversarial Autoencoder (MusAE). The empirical analysis on a large scale benchmark shows that our model has a higher reconstruction accuracy than state-of-the-art models based on standard variational autoencoders. It is also able to create realistic interpolations between two musical sequences, smoothly changing the dynamics of the different tracks. Experiments show that the model can organise its latent space accordingly to low-level properties of the musical pieces, as well as to embed into the latent variables the high-level genre information injected from the prior distribution to increase its overall performance. This allows us to perform changes to the generated pieces in a principled way.

Authors: Andrea Valenti, Antonio Carta, Davide Bacciu

Original URL: https://arxiv.org/abs/2001.05494