Audio Source Separation and Speech Enhancement

by Emmanuel Vincent, Tuomas Virtanen, Sharon Gannot
October 2018
Intermediate to advanced
504 pages
18h 50m
English
Wiley

14 Gaussian Model Based Multichannel Separation

Alexey Ozerov and Hirokazu Kameoka

The Gaussian framework for multichannel source separation models vectors of STFT coefficients as multivariate complex Gaussian random vectors. It allows spatial and spectral models of the source spatial images to be specified and their parameters to be estimated jointly. Multichannel nonnegative matrix factorization, illustrated in Figure 14.1, is one of the most popular methods of this kind. It combines nonnegative matrix factorization (NMF) (see Chapter 8) with narrowband spatial modeling (see Chapter 3). Besides NMF, the Gaussian framework makes it possible to reuse many other single‐channel spectral models in a multichannel scenario. It differs from the frameworks in Chapters 11, 12, and 13 in that more advanced generative spectral models are typically used. In terms of the general taxonomies introduced in Chapter 1, it also covers a wide range of audio source separation scenarios, including over‐ or underdetermined mixtures and weakly or strongly guided separation, and a wide range of methods that are either learning‐free or based on unsupervised/supervised source modeling.
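
To make the combination of a spectral and a spatial model concrete, the following is a minimal NumPy sketch of one common instance of the framework, not the chapter's own notation or algorithm. It assumes that each source spatial image is zero-mean complex Gaussian with covariance `v[j, f, n] * R[j, f]`, where `v` is an NMF power spectrogram and `R` is a frequency-dependent spatial covariance matrix. All names (`W`, `H`, `R`, `v`), the array sizes, and the random parameter values are illustrative assumptions; in practice the parameters would be estimated from the mixture. The final multichannel Wiener filtering step is the usual way such a model is used for separation and is included here as background.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (assumptions): frequency bins, frames, channels, sources, NMF components
F, N, I, J, K = 4, 5, 2, 2, 3

# Spectral model: nonnegative NMF factors per source,
# v[j, f, n] = sum_k W[j, f, k] * H[j, k, n]
W = rng.random((J, F, K)) + 1e-3
H = rng.random((J, K, N)) + 1e-3
v = W @ H                                             # shape (J, F, N)

# Spatial model: one Hermitian positive definite I x I matrix per source and frequency
A = rng.standard_normal((J, F, I, I)) + 1j * rng.standard_normal((J, F, I, I))
R = A @ A.conj().swapaxes(-1, -2) + 1e-3 * np.eye(I)  # shape (J, F, I, I)

# Covariance of each source spatial image: Sigma_c[j, f, n] = v[j, f, n] * R[j, f]
Sigma_c = v[..., None, None] * R[:, :, None, :, :]    # shape (J, F, N, I, I)

# Assuming independent sources, the mixture covariance is the sum over sources
Sigma_x = Sigma_c.sum(axis=0)                         # shape (F, N, I, I)

# Multichannel Wiener filter for each source: G_j = Sigma_c_j @ inv(Sigma_x)
G = Sigma_c @ np.linalg.inv(Sigma_x)                  # shape (J, F, N, I, I)

# Apply the filters to a (random, placeholder) mixture STFT of shape (F, N, I)
x = rng.standard_normal((F, N, I)) + 1j * rng.standard_normal((F, N, I))
c_hat = (G @ x[None, ..., None])[..., 0]              # shape (J, F, N, I)
```

Because the filters `G` sum to the identity over sources, the estimated spatial images in `c_hat` add up to the mixture `x`, which is a convenient sanity check for any implementation of this kind.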

In Section 14.1 we introduce the multichannel Gaussian framework. In Section 14.2 we provide a detailed list of spectral and spatial models. We explain how to estimate the parameters of these models in Section 14.3. We give a detailed presentation of a few methods in Section 14.4 and provide a summary in Section ...

Publisher Resources

ISBN: 9781119279891