Score and Lyrics-Free Singing Voice Generation (Sounds)
- January 3, 2020
- Notes
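Generative models for singing voice have mostly focused on "singing voice synthesis," that is, producing singing voice waveforms from a given musical score and text lyrics. This work explores a novel and challenging alternative: generating singing voices without pre-assigned scores or lyrics, at both training and inference time. Specifically, the authors propose three unconditioned or weakly conditioned singing voice generation schemes. They outline the associated challenges and propose a pipeline for tackling these new tasks, which involves developing source separation and transcription models for data preparation, adversarial networks for audio generation, and customized metrics for evaluation.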
Original title: Sounds: Score and Lyrics-Free Singing Voice Generation
Original abstract: Generative models for singing voice have been mostly concerned with the task of "singing voice synthesis," i.e., to produce singing voice waveforms given musical scores and text lyrics. In this work, we explore a novel yet challenging alternative: singing voice generation without pre-assigned scores and lyrics, in both training and inference time. In particular, we propose three either unconditioned or weakly conditioned singing voice generation schemes. We outline the associated challenges and propose a pipeline to tackle these new tasks. This involves the development of source separation and transcription models for data preparation, adversarial networks for audio generation, and customized metrics for evaluation.
Original authors: Jen-Yu Liu, Yu-Hua Chen, Yin-Cheng Yeh, Yi-Hsuan Yang
Original link: https://arxiv.org/abs/1912.11747