Glottal Source Processing: From Analysis to Applications (CS CompLang)

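Most current voice technology applications rely on acoustic features that characterize the vocal tract response, such as the widely used MFCC or LPC parameters. The airflow passing through the vocal folds, known as the glottal flow, is nevertheless expected to carry relevant complementary information. Unfortunately, analyzing the glottal source from speech recordings requires specific and more complex processing, which explains why it has generally been avoided. This review gives a general overview of the techniques designed for glottal source processing. Starting from the fundamental analysis tools (pitch tracking, glottal closure instant detection, and glottal flow estimation and modelling), the paper then highlights how these solutions can be properly integrated into various voice technology applications.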

Original title: Glottal Source Processing: from Analysis to Applications

Original abstract: The great majority of current voice technology applications relies on acoustic features characterizing the vocal tract response, such as the widely used MFCC of LPC parameters. Nonetheless, the airflow passing through the vocal folds, and called glottal flow, is expected to exhibit a relevant complementarity. Unfortunately, glottal analysis from speech recordings requires specific and more complex processing operations, which explains why it has been generally avoided. This review gives a general overview of techniques which have been designed for glottal source processing. Starting from fundamental analysis tools of pitch tracking, glottal closure instant detection, glottal flow estimation and modelling, this paper then highlights how these solutions can be properly integrated within various voice technology applications.

Original authors: Thomas Drugman, Paavo Alku, Abeer Alwan, Bayya Yegnanarayana

Original link: https://arxiv.org/abs/1912.12604
