Audio Source Separation and Speech Enhancement

by Emmanuel Vincent, Tuomas Virtanen, Sharon Gannot
October 2018
Intermediate to advanced
504 pages
18h 50m
English
Wiley

11 Multichannel Parameter Estimation

Shmulik Markovich‐Golan, Walter Kellermann, and Sharon Gannot

In this chapter we explore some widely used structures and estimation procedures for tracking the parameters required to construct the data‐dependent spatial filters for audio signals discussed in Chapter 10. As before, the spatial filters are designed to extract a desired source contaminated by background noise in the single‐speaker case, or by interfering speakers and background noise in the multiple‐speaker case.

The spatial filters explored in Chapter 10 (mainly, those referred to as beamformers) assume that certain parameters are available for their computation, namely the relative transfer functions (RTFs) of the speakers, the covariance matrices of the background noise and the speakers, and/or the cross‐covariance between the mixture signals and the desired signal.

The variety of optimization criteria in Chapter 10 leads to different data‐dependent spatial filters, yet most of them rely on similar parameters, and therefore a common estimation framework can be derived. In general, estimates of speech presence probability (SPP) are used to govern the estimation of noise and speech spatial covariance matrices. These estimates are then utilized to estimate source RTF vectors. Finally, the data‐dependent spatial filters are designed based on the latter estimates. A high‐level block diagram of the common estimation framework is depicted in Figure 11.1.
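
As a rough illustration of this framework, the following Python sketch runs the four stages on synthetic data for a single frequency bin. Everything specific in it is an assumption made for illustration rather than the chapter's own procedure: the oracle speaker activity that stands in for an SPP estimator, the covariance subtraction step used to obtain the RTF, the choice of an MVDR beamformer as the spatial filter, and all names such as `weighted_covariance` and `rtf_true`.

```python
import numpy as np

rng = np.random.default_rng(1)
num_mics, num_frames = 4, 20000

# Hypothetical ground truth for one frequency bin; microphone 0 is the reference,
# so the first RTF entry equals 1.
rtf_true = np.array([1.0, 0.8 * np.exp(-0.3j), 0.6 * np.exp(0.5j), 0.9 * np.exp(-1.1j)])
active = rng.random(num_frames) < 0.5  # frames in which the speaker is active


def complex_gaussian(shape, variance):
    """Circular complex Gaussian samples with the given variance."""
    scale = np.sqrt(variance / 2)
    return scale * (rng.standard_normal(shape) + 1j * rng.standard_normal(shape))


speech = complex_gaussian(num_frames, 1.0) * active
noise = complex_gaussian((num_frames, num_mics), 0.1)
mixture = speech[:, None] * rtf_true[None, :] + noise  # shape (frames, mics)

# Stage 1: speech presence probability per frame.
# Assumption: oracle activity replaces a real SPP estimator.
spp = active.astype(float)


def weighted_covariance(frames, weights):
    """Weighted spatial covariance: sum_t w_t x_t x_t^H / sum_t w_t."""
    return np.einsum("t,ti,tj->ij", weights, frames, frames.conj()) / weights.sum()


# Stage 2: SPP-governed covariance estimates.
noise_cov = weighted_covariance(mixture, 1.0 - spp)  # noise-dominated frames
noisy_cov = weighted_covariance(mixture, spp)        # speech-dominated frames
speech_cov = noisy_cov - noise_cov                   # approximately rank one

# Stage 3: RTF by covariance subtraction (reference column, normalised).
rtf_est = speech_cov[:, 0] / speech_cov[0, 0]

# Stage 4: MVDR weights, w = Phi_n^{-1} a / (a^H Phi_n^{-1} a).
solved = np.linalg.solve(noise_cov, rtf_est)
weights = solved / (rtf_est.conj() @ solved)
output = mixture @ weights.conj()  # y_t = w^H x_t

rtf_error = np.linalg.norm(rtf_est - rtf_true) / np.linalg.norm(rtf_true)
noise_in = np.mean(np.abs(mixture[~active, 0]) ** 2)
noise_out = np.mean(np.abs(output[~active]) ** 2)
print(f"relative RTF error: {rtf_error:.4f}")
print(f"noise power at reference mic: {noise_in:.4f}, after beamforming: {noise_out:.4f}")
```

In this sketch each stage consumes only the output of the previous one, which mirrors the ordering described above. In practice the SPP would itself be estimated from the mixture, and the covariance estimates would typically be updated recursively over time rather than averaged in one batch.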

Publisher Resources

ISBN: 9781119279891