6 Single‐Channel Speech Presence Probability Estimation and Noise Tracking
Rainer Martin and Israel Cohen
The single‐channel enhancement filters reviewed in Chapter 5 require knowledge of the power spectra of the target and the interference signals. Since the target and interfering signals are not available, their power spectra must be estimated from the mixture signal. In most acoustic scenarios the power spectra of both the target and the interfering signals are time‐varying and therefore require online tracking. Altogether, this constitutes a challenging estimation problem, especially when the interference is highly nonstationary and when it occupies the same frequency bands as the target signal.
Most algorithms in this domain rely on specific statistical differences between speech as the target signal and interfering noise signals. The methods presented in this chapter are developed for a single speaker mixed with short‐time stationary environmental noise such as car noise and multiple‐speaker babble noise. These methods will most likely fail when the interference is a single competing speaker. In that case, there are in general no speaker‐independent statistical differences that could be exploited. Single‐channel speaker separation methods must then be used, which typically require trained models of specific speakers. These methods are outside the scope of this chapter and are discussed in Chapters 7, 8, and 9. Speech and noise power spectrum estimation is closely related ...