Emmanuel Vincent, Sharon Gannot, and Tuomas Virtanen

Source separation and speech enhancement are core problems in the field of audio signal processing, with applications to speech, music, and environmental audio. Research in this field has accompanied technological trends, such as the move from landline to mobile or hands‐free phones, the gradual replacement of stereo by 3D audio, and the emergence of connected devices equipped with one or more microphones that can execute audio processing tasks which were previously regarded as impossible. In this short introductory chapter, after a brief discussion of the application needs in Section 1.1, we define the problems of source separation and speech enhancement and introduce relevant terminology regarding the scenarios and the desired outcome in Section 1.2. We then present the general processing scheme followed by most source separation and speech enhancement approaches and categorize these approaches in Section 1.3. Finally, we provide an outline of the book in Section 1.4.

1.1 Why are Source Separation and Speech Enhancement Needed?

The problems of source separation and speech enhancement arise from several application needs in the context of speech, music, and environmental audio processing.

Real‐world speech signals are often contaminated by interfering speakers, environmental noise, and/or reverberation. These phenomena deteriorate speech quality and, in adverse scenarios, speech intelligibility and automatic speech recognition ...
