Audio Source Separation and Speech Enhancement

by Emmanuel Vincent, Tuomas Virtanen, Sharon Gannot
October 2018
Intermediate to advanced
504 pages
18h 50m
English
Wiley

4 Multichannel Source Activity Detection, Localization, and Tracking

Pasi Pertilä, Alessio Brutti, Piergiorgio Svaizer, and Maurizio Omologo

In the previous chapters, we have seen how both spectral and spatial properties of sound sources are relevant in describing an acoustic scene picked up by multiple microphones distributed in a real environment. This chapter now provides an introduction to the most common problems and methods related to source activity detection, localization, and tracking based on a multichannel acquisition setup. In Section 4.1 we start with a brief overview and with the definition of some basic notions, in particular related to time difference of arrival (TDOA) estimation and to the so‐called acoustic maps. In Section 4.2, the activity detection problem will be addressed, starting from an overview of the most common methods and concluding with recent trends. Section 4.3 will examine the localization problem for both static and moving sources, with some insights into the localization of multiple sources. Section 4.4 will conclude the chapter.

Note that, although activity detection and localization apply to any kind of audio source, often the signal of interest is speech. Due to its spectral and temporal peculiarities, speech calls for specific algorithmic solutions which typically do not generalize to other sources. As an example, speech is by definition a nonstationary process, characterized by both long and very short pauses. Therefore, especially in Section 4.2 ...

ISBN: 9781119279891