Learning a Style-Aware Latent Space for Symbolic Music with Adversarial Autoencoders (cs.SD)

  • January 19, 2020
  • Notes

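We address the challenging open problem of learning an effective latent space for symbolic music data in generative music modeling. Our focus is on adversarial regularization as a flexible and natural means of imbuing variational autoencoders with contextual information about music genre and style. By introducing the first Music Adversarial Autoencoder (MusAE), we show how Gaussian mixtures that take music metadata into account can serve as an effective prior for the autoencoder's latent space. An empirical analysis on a large-scale benchmark shows that our model achieves higher reconstruction accuracy than state-of-the-art models based on standard variational autoencoders. It can also create realistic interpolations between two musical sequences, smoothly changing the dynamics of the different tracks. Experiments show that the model can organize its latent space according to low-level properties of the musical pieces, and can embed into the latent variables the high-level genre information injected through the prior distribution, which improves overall performance. This allows us to make changes to the generated pieces in a principled way.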
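The two latent-space ideas in the abstract, a Gaussian mixture prior informed by genre metadata and interpolation between two encoded sequences, can be illustrated with a small NumPy sketch. This is not the MusAE implementation: the function names (`make_genre_prior`, `sample_prior`, `interpolate`), the one-component-per-genre layout, the placement of the means on a sphere, and the use of plain linear interpolation are all assumptions made for illustration. In an adversarial autoencoder, samples like these would typically be shown to a discriminator as "real" latent codes, while the encoder's outputs play the role of "fake" ones.

```python
import numpy as np


def make_genre_prior(num_genres, latent_dim, radius=4.0, seed=0):
    """Return one mixture-component mean per genre (hypothetical layout).

    Means are random directions scaled to a fixed radius so that the
    components are spread apart; the paper may arrange them differently.
    """
    rng = np.random.default_rng(seed)
    means = rng.normal(size=(num_genres, latent_dim))
    means *= radius / np.linalg.norm(means, axis=1, keepdims=True)
    return means


def sample_prior(means, genre_ids, sigma=1.0, rng=None):
    """Draw one latent vector per genre label from that genre's Gaussian."""
    rng = np.random.default_rng() if rng is None else rng
    genre_ids = np.asarray(genre_ids)
    noise = rng.normal(scale=sigma, size=(genre_ids.shape[0], means.shape[1]))
    return means[genre_ids] + noise


def interpolate(z_start, z_end, steps):
    """Linearly interpolate between two latent vectors (assumed scheme)."""
    alphas = np.linspace(0.0, 1.0, steps)[:, None]
    return (1.0 - alphas) * z_start + alphas * z_end


if __name__ == "__main__":
    means = make_genre_prior(num_genres=3, latent_dim=8)
    # Genre labels would come from the dataset's metadata.
    z = sample_prior(means, genre_ids=[0, 2, 2, 1], rng=np.random.default_rng(1))
    path = interpolate(z[0], z[1], steps=5)
    print(z.shape, path.shape)
```

Each row of `path` would be passed through the decoder to obtain one intermediate musical sequence; the decoder itself is outside the scope of this sketch.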

Original title: Learning a Latent Space of Style-Aware Symbolic Music Representations by Adversarial Autoencoders

Original abstract: We address the challenging open problem of learning an effective latent space for symbolic music data in generative music modeling. We focus on leveraging adversarial regularization as a flexible and natural mean to imbue variational autoencoders with context information concerning music genre and style. Through the paper, we show how Gaussian mixtures taking into account music metadata information can be used as an effective prior for the autoencoder latent space, introducing the first Music Adversarial Autoencoder (MusAE). The empirical analysis on a large scale benchmark shows that our model has a higher reconstruction accuracy than state-of-the-art models based on standard variational autoencoders. It is also able to create realistic interpolations between two musical sequences, smoothly changing the dynamics of the different tracks. Experiments show that the model can organise its latent space accordingly to low-level properties of the musical pieces, as well as to embed into the latent variables the high-level genre information injected from the prior distribution to increase its overall performance. This allows us to perform changes to the generated pieces in a principled way.

Original authors: Andrea Valenti, Antonio Carta, Davide Bacciu

Original URL: https://arxiv.org/abs/2001.05494