Audio Source Separation and Speech Enhancement

by Emmanuel Vincent, Tuomas Virtanen, Sharon Gannot
October 2018
Intermediate to advanced
504 pages
18h 50m
English
Wiley
Content preview from Audio Source Separation and Speech Enhancement

7 Single‐Channel Classification and Clustering Approaches

Felix Weninger, Jun Du, Erik Marchi, and Tian Gao

The separation of sources from single‐channel mixtures is particularly challenging. If two or more microphones are available, information on relative amplitudes or relative time delays can be used to identify the sources and help to perform the separation (see Chapter 12). Yet, with only one microphone, this information is not available. Instead, information about the structure of the source signals must be exploited to identify and separate the different components.

Methods for single‐channel source separation can be roughly grouped into two categories: clustering and classification/regression. Clustering algorithms group similar time‐frequency bins. This category notably includes computational auditory scene analysis (CASA) approaches, which rely on psychoacoustic cues in a learning‐free mode, i.e. no models of individual sources are assumed; instead, generic properties of acoustic signals are exploited. In contrast, classification and regression algorithms are used in separation‐based training to predict the source belonging to the target class or to classify the type of source that dominates each time‐frequency bin. Factorial hidden Markov models (HMMs) are generative models that explain the statistics of a mixture based on statistical models of the individual source signals, and hence rely on source‐based unsupervised training, i.e. training a model for each source from ...
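As a rough illustration of what "classifying the type of source that dominates each time‐frequency bin" can mean, the sketch below computes an oracle binary label for every bin of a toy two‐source mixture. It is not taken from the chapter: the function names (`stft_mag`, `dominant_source_mask`), the synthetic sine‐wave signals, and the frame and hop sizes are all assumptions chosen for illustration. Because the labels require the isolated sources, they would serve only as training targets for a classifier, not as a separation method in themselves.

```python
import numpy as np

def stft_mag(x, frame_len=256, hop=128):
    """Magnitude spectrogram from a Hann-windowed short-time Fourier transform."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack([x[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1))  # shape: (frames, frequency bins)

def dominant_source_mask(target_mag, interferer_mag):
    """Binary label per time-frequency bin: True where the target dominates."""
    return target_mag > interferer_mag

# Toy signals (hypothetical): a 440 Hz tone as target, a 2000 Hz tone as interferer.
sr = 8000
t = np.arange(sr) / sr
target = np.sin(2 * np.pi * 440 * t)
interferer = 0.8 * np.sin(2 * np.pi * 2000 * t)
mixture = target + interferer

# Oracle labels need the isolated sources, so they are only available for training data.
mask = dominant_source_mask(stft_mag(target), stft_mag(interferer))

# Applying the labels as a mask keeps only the mixture bins attributed to the target.
masked_mixture_mag = mask * stft_mag(mixture)
```

A classifier trained in this spirit would predict `mask` from features of `mixture` alone; a clustering approach would instead try to recover a similar grouping of bins without any labeled training data.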



Publisher Resources

ISBN: 9781119279891