Glottal Source Processing: from Analysis to Applications (CS CompLang)

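Most current speech technology applications rely on acoustic features that characterize the vocal tract response, such as the widely used MFCC or LPC parameters. The airflow through the vocal folds, known as the glottal flow, is nonetheless expected to carry complementary information. Estimating it from speech recordings, however, requires specific and more complex processing, which is why it has generally been avoided. This review gives a general overview of techniques designed for glottal source processing. It starts from the fundamental analysis tools (pitch tracking, glottal closure instant detection, and glottal flow estimation and modelling) and then shows how these solutions can be properly integrated into various speech technology applications.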

Original title: Glottal Source Processing: from Analysis to Applications

Original abstract: The great majority of current voice technology applications relies on acoustic features characterizing the vocal tract response, such as the widely used MFCC or LPC parameters. Nonetheless, the airflow passing through the vocal folds, and called glottal flow, is expected to exhibit a relevant complementarity. Unfortunately, glottal analysis from speech recordings requires specific and more complex processing operations, which explains why it has been generally avoided. This review gives a general overview of techniques which have been designed for glottal source processing. Starting from fundamental analysis tools of pitch tracking, glottal closure instant detection, glottal flow estimation and modelling, this paper then highlights how these solutions can be properly integrated within various voice technology applications.

Authors: Thomas Drugman, Paavo Alku, Abeer Alwan, Bayya Yegnanarayana

Original URL: https://arxiv.org/abs/1912.12604
