Glottal Closure and Opening Instant Detection from Speech Signals (Sound)

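This paper proposes a new method for detecting Glottal Closure and Opening Instants (GCIs and GOIs) directly from the speech waveform. The procedure consists of two successive steps. First, a mean-based signal is computed, and the time intervals in which speech events are expected to occur are extracted from it. Second, within each interval, the precise position of the speech event is determined by locating a discontinuity in the Linear Prediction residual. The method is compared with the DYPSA algorithm on the CMU ARCTIC database. It yields a significant improvement and better robustness to noise. In addition, the accuracy of GOI identification is promising for glottal source characterization.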

Original title: Sound: Glottal Closure and Opening Instant Detection from Speech Signals

This paper proposes a new procedure to detect Glottal Closure and Opening Instants (GCIs and GOIs) directly from speech waveforms. The procedure is divided into two successive steps. First a mean-based signal is computed, and intervals where speech events are expected to occur are extracted from it. Secondly, at each interval a precise position of the speech event is assigned by locating a discontinuity in the Linear Prediction residual. The proposed method is compared to the DYPSA algorithm on the CMU ARCTIC database. A significant improvement as well as a better noise robustness are reported. Besides, results of GOI identification accuracy are promising for the glottal source characterization.
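
The two steps described in the abstract can be sketched roughly as follows. This is a minimal illustration in pure Python, not the authors' implementation. The abstract does not give the details, so several choices here are assumptions: the Blackman window and its length `win_len`, the rule that a candidate interval runs from a negative local minimum of the mean-based signal to the next upward zero crossing, the use of one LP model for the whole signal, and the reading of "discontinuity" as the sample with the largest absolute residual. All function names are hypothetical.

```python
import math

def mean_based_signal(x, win_len):
    """Blackman-weighted moving average of x; win_len must be odd and >= 3."""
    half = win_len // 2
    w = [0.42 - 0.5 * math.cos(2 * math.pi * k / (win_len - 1))
         + 0.08 * math.cos(4 * math.pi * k / (win_len - 1))
         for k in range(win_len)]
    y = []
    for n in range(len(x)):
        acc = 0.0
        for k in range(win_len):
            i = n + k - half
            if 0 <= i < len(x):  # samples outside the signal count as zero
                acc += w[k] * x[i]
        y.append(acc / win_len)
    return y

def candidate_intervals(y):
    """Assumed rule: from a negative local minimum to the next upward zero crossing."""
    intervals = []
    n = 1
    while n < len(y) - 1:
        if y[n] < 0 and y[n - 1] > y[n] <= y[n + 1]:
            m = n + 1
            while m < len(y) and not (y[m - 1] < 0 <= y[m]):
                m += 1
            if m < len(y):
                intervals.append((n, m))  # both ends inclusive
            n = m
        else:
            n += 1
    return intervals

def lp_residual(x, order):
    """LP residual via the autocorrelation method and Levinson-Durbin recursion."""
    n = len(x)
    r = [sum(x[i] * x[i + lag] for i in range(n - lag)) for lag in range(order + 1)]
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0:  # silent or degenerate signal: keep the coefficients found so far
            break
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    # e[t] = x[t] + a[1] * x[t-1] + ... + a[order] * x[t-order]
    return [sum(a[j] * x[t - j] for j in range(order + 1) if t - j >= 0)
            for t in range(n)]

def locate_instants(residual, intervals):
    """Pick, in each inclusive interval, the sample with the largest |residual|."""
    return [max(range(s, e + 1), key=lambda i: abs(residual[i]))
            for s, e in intervals]
```

Under these assumptions, the pieces would be chained as `locate_instants(lp_residual(x, order), candidate_intervals(mean_based_signal(x, win_len)))`. In practice the window length would probably need to be tied to the speaker's pitch period and the signal polarity checked beforehand; neither point is specified in the abstract.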

Authors: Thomas Drugman, Thierry Dutoit

Link: https://arxiv.org/abs/2001.00841