Audio Source Separation and Speech Enhancement

by Emmanuel Vincent, Tuomas Virtanen, Sharon Gannot
October 2018
Intermediate to advanced
504 pages
18h 50m
English
Wiley

17 Application of Source Separation to Robust Speech Analysis and Recognition

Shinji Watanabe, Tuomas Virtanen, and Dorothea Kolossa

This chapter describes applications of source separation techniques to robust speech analysis and recognition, including automatic speech recognition (ASR), speaker/language identification, emotion and paralinguistic analysis, and audiovisual analysis. These are among the most successful applications in audio and speech processing, with commercial products including Google Voice Search, Apple Siri, Amazon Echo, and Microsoft Cortana. Robustness against noise and nontarget speech remains a challenging issue, and source separation and speech enhancement techniques are attracting considerable attention in the speech community.

This chapter systematically describes how source separation and speech enhancement techniques are applied to improve the robustness of these applications. It first describes the challenges and opportunities in Section 17.1, and defines the considered speech analysis and recognition applications with basic formulations in Section 17.2. Section 17.3 describes the current state‐of‐the‐art system using source separation as a front‐end method for speech analysis and recognition. Section 17.4 introduces a way of tightly integrating these methods by preserving the uncertainties between them. Section 17.5 provides another possible solution to the robustness issues with the help of cross‐modality information. Section 17.6 concludes the chapter. ...

Publisher Resources

ISBN: 9781119279891