Robust Automatic Speech Recognition

by Jinyu Li, Li Deng, Reinhold Haeb-Umbach, Yifan Gong

October 2015

Intermediate to advanced

306 pages

10h 38m

English

Academic Press

Abstract
1.1 Automatic Speech Recognition
1.2 Robustness to Noisy Environments
1.3 Existing Surveys in the Area
1.4 Book Structure Overview

Abstract
2.1 Introduction: Components of Speech Recognition
2.2 Gaussian Mixture Models
2.3 Hidden Markov Models and the Variants
2.4 Deep Learning and Deep Neural Networks
2.5 Summary

Abstract
3.1 Standard Evaluation Databases
3.2 Modeling Distortions of Speech in Acoustic Environments
3.3 Impact of Acoustic Distortion on Gaussian Modeling
3.4 Impact of Acoustic Distortion on DNN Modeling
3.5 A General Framework for Robust Speech Recognition
3.6 Categorizing Robust ASR Techniques: An Overview
3.7 Summary

Abstract
4.1 Feature-Space Approaches
4.2 Model-Space Approaches
4.3 Summary

Abstract
5.1 Learning from Stereo Data
5.2 Learning from Multi-Environment Data
5.3 Summary

Abstract
6.1 Parallel Model Combination
6.2 Vector Taylor Series
6.3 Sampling-Based Methods
6.4 Acoustic Factorization
6.5 Summary

Abstract
7.1 Model-Domain Uncertainty
7.2 Feature-Domain Uncertainty
7.3 Joint Uncertainty Decoding
7.4 Missing-Feature Approaches
7.5 Summary

Abstract
8.1 Speaker Adaptive and Source Normalization Training
8.2 Model Space Noise Adaptive Training
8.3 Joint Training for DNN
8.4 Summary

Abstract
9.1 Introduction
9.2 Acoustic Impulse Response
9.3 A Model of Reverberated Speech in Different Domains
9.4 The Effect of Reverberation on ASR Performance
9.5 Linear Filtering Approaches
9.6 Magnitude or Power Spectrum Enhancement
9.7 Feature Domain Approaches
9.8 Acoustic Model Domain Approaches
9.9 The REVERB Challenge
9.10 To Probe Further
9.11 Summary

Abstract
10.1 Introduction
10.2 The Acoustic Beamforming Problem
10.3 Fundamentals of Data-Dependent Beamforming
10.4 Multi-Channel Speech Recognition
10.5 To Probe Further
10.6 Summary

Abstract
11.1 Robust Methods in the Era of GMM
11.2 Robust Methods in the Era of DNN
11.3 Multi-Channel Input and Robustness to Reverberation
11.4 Epilogue

Content preview from Robust Automatic Speech Recognition

Chapter 10

Multi-channel processing

Abstract

Almost all noise-robust ASR techniques discussed so far in the book have assumed a single microphone device that captures distorted speech signals. We devote this chapter to the techniques developed with multiple devices. Because hardware has become cheap, a growing share of devices feature multiple sound-capturing channels. This adds a spatial dimension to the otherwise purely spectro-temporal processing of single-microphone systems. If the target speech and the interferers are spatially separated, beamforming and other multi-channel processing can greatly improve the target signal-to-noise ratio. Such processing is particularly advantageous in the presence of nonstationary interferers ...
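The claim that spatial separation lets a beamformer raise the target signal-to-noise ratio can be illustrated with a minimal delay-and-sum sketch. This is not the method developed in the chapter, only a toy example under stated assumptions: a four-microphone array, known integer sample delays for the target, a white-noise interferer arriving from a different direction (so with different delays), and circular shifts standing in for propagation. All names (`delay_and_sum`, `snr_db`, the delay values) are hypothetical.

```python
import numpy as np

def delay_and_sum(channels, delays):
    """Undo each channel's integer sample delay, then average the channels."""
    aligned = [np.roll(ch, -d) for ch, d in zip(channels, delays)]
    return np.mean(aligned, axis=0)

def snr_db(clean, observed):
    """SNR in dB, treating (observed - clean) as the noise component."""
    noise = observed - clean
    return 10.0 * np.log10(np.sum(clean ** 2) / np.sum(noise ** 2))

rng = np.random.default_rng(0)
fs, n_samples = 16000, 16000

# Assumed toy signals: a 440 Hz tone stands in for the target speech,
# white Gaussian noise stands in for a spatially separated interferer.
t = np.arange(n_samples) / fs
target = np.sin(2 * np.pi * 440.0 * t)
interferer = rng.normal(0.0, 1.0, n_samples)

# Hypothetical per-microphone arrival delays in samples. The two sources
# come from different directions, so their delay patterns differ.
target_delays = [0, 3, 6, 9]
interferer_delays = [0, 5, 10, 15]

# Each microphone records a delayed target plus a delayed interferer.
mics = np.stack([
    np.roll(target, dt) + np.roll(interferer, di)
    for dt, di in zip(target_delays, interferer_delays)
])

# Steering toward the target aligns its copies but leaves the
# interferer copies misaligned, so they partially cancel when averaged.
enhanced = delay_and_sum(mics, target_delays)

snr_single = snr_db(target, mics[0])
snr_beam = snr_db(target, enhanced)
gain_db = snr_beam - snr_single
print(f"single mic: {snr_single:.1f} dB, beamformed: {snr_beam:.1f} dB")
```

Because the misaligned copies of a white interferer are nearly uncorrelated, averaging M channels cuts its power by about a factor of M, so the gain here should be close to 10·log10(4) ≈ 6 dB. Real rooms add reverberation, fractional delays, and correlated noise, which is why data-dependent beamformers are used in practice.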

Publisher Resources

ISBN: 9780128026168

You might also like

Techniques for Noise Robustness in Automatic Speech Recognition

Intelligent Speech Signal Processing

Audio Source Separation and Speech Enhancement

Cognitive Virtual Assistants Using Google Dialogflow: Develop Complex Cognitive Bots Using the Google Dialogflow Platform
