Using a Pitch-Synchronous Residual Codebook for Hybrid HMM/Frame Selection Speech Synthesis (CS Sound)
- January 2, 2020
- Notes
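This paper proposes a method for improving the quality of statistical parametric speech synthesizers. It uses a codebook of pitch-synchronous residual frames to build a more realistic source signal. First, a limited codebook of typical excitations is built from a training database. At synthesis time, HMMs generate the filter coefficients and the source coefficients; the latter contain both the pitch and a compact representation of the target residual frames. The source signal is then obtained by concatenating excitation frames picked from the codebook according to a selection criterion that takes the target residual coefficients as input. Subjective results show a relevant improvement over the basic technique.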
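The pipeline described in the abstract (build a codebook, select a frame per target, concatenate) can be sketched roughly as follows. This is only an illustration under several assumptions that the abstract does not confirm: residual frames are already extracted and normalized to a fixed length, the "compact representation" is taken to be a PCA projection, the selection criterion is a Euclidean nearest neighbor in coefficient space, and frames are simply resampled to the target pitch period and joined end to end. All function names (`build_codebook`, `select_frame`, `synthesize_source`) are hypothetical, and random data stands in for real residuals and HMM output.

```python
import numpy as np

def build_codebook(residual_frames, n_coeffs):
    """Build a codebook from fixed-length residual frames (assumed input).

    Returns the frames, and their compact coefficients obtained by
    projecting onto the first n_coeffs principal components (an assumption).
    """
    frames = np.asarray(residual_frames, dtype=float)
    mean = frames.mean(axis=0)
    # PCA via SVD of the centered frames; rows of vt are principal axes.
    _, _, vt = np.linalg.svd(frames - mean, full_matrices=False)
    basis = vt[:n_coeffs]
    coeffs = (frames - mean) @ basis.T
    return frames, coeffs

def select_frame(target_coeffs, codebook_coeffs):
    """Assumed selection criterion: nearest codebook entry (Euclidean)."""
    dists = np.linalg.norm(codebook_coeffs - target_coeffs, axis=1)
    return int(np.argmin(dists))

def resample(frame, length):
    """Linearly resample a frame to the requested number of samples."""
    x_old = np.linspace(0.0, 1.0, num=len(frame))
    x_new = np.linspace(0.0, 1.0, num=length)
    return np.interp(x_new, x_old, frame)

def synthesize_source(target_coeffs_seq, pitch_periods, frames, coeffs):
    """Concatenate one selected, pitch-adapted frame per target."""
    pieces = []
    for target, period in zip(target_coeffs_seq, pitch_periods):
        idx = select_frame(target, coeffs)
        pieces.append(resample(frames[idx], int(period)))
    return np.concatenate(pieces)

# Toy run: random vectors stand in for training residual frames.
rng = np.random.default_rng(0)
training = rng.standard_normal((50, 64))
frames, coeffs = build_codebook(training, n_coeffs=8)

# Pretend the HMMs generated these target coefficients and pitch periods.
targets = coeffs[[3, 10, 42]]
periods = [80, 100, 90]
source = synthesize_source(targets, periods, frames, coeffs)
```

A real system would likely smooth the frame boundaries (for example with overlap-add) rather than join frames directly; plain concatenation is used here only to keep the sketch short.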
Original title: Using a Pitch-Synchronous Residual Codebook for Hybrid HMM/Frame Selection Speech Synthesis
Original abstract: This paper proposes a method to improve the quality delivered by statistical parametric speech synthesizers. For this, we use a codebook of pitch-synchronous residual frames, so as to construct a more realistic source signal. First a limited codebook of typical excitations is built from some training database. During the synthesis part, HMMs are used to generate filter and source coefficients. The latter coefficients contain both the pitch and a compact representation of target residual frames. The source signal is obtained by concatenating excitation frames picked up from the codebook, based on a selection criterion and taking target residual coefficients as input. Subjective results show a relevant improvement compared to the basic technique.
Original authors: Thomas Drugman, Alexis Moinet, Thierry Dutoit, Geoffrey Wilfart
Original URL: https://arxiv.org/abs/1912.12887