Techniques for Noise Robustness in Automatic Speech Recognition

by Rita Singh, Tuomas Virtanen, Bhiksha Raj
November 2012
Intermediate to advanced
514 pages
17h 40m
English
Wiley

13 Acoustic Model Training for Robust Speech Recognition

Michael L. Seltzer

Microsoft Research, USA

13.1 Introduction

Traditionally, researchers working in the field of noise robustness have focused their efforts on two areas: front-end enhancement and model compensation. Front-end enhancement encompasses a variety of signal and feature processing methods, such as those discussed in Chapters 4 and 9, that are designed to remove distortions in the speech caused by the acoustic environment [10,30,37]. Model compensation, described in Chapters 11 and 12, instead alters the parameters of the speech recognizer's acoustic models to better match the characteristics of the current environment [13,17,32]. There is a rich literature in both of these areas that has led to improvements in speech-recognition performance over the years [14].

While all of this effort is focused on noise compensation at runtime, relatively little attention has been paid to the manner in which the speech-recognition systems are trained. Almost all of the robustness algorithms assume, either implicitly or explicitly, that the recognizer has been trained from clean speech, and the job of a noise-robustness technique is to reduce the mismatch between the clean acoustic models and the noisy speech. As a result, performance is determined by how well the captured speech is denoised or how well the clean acoustic models adapt to the environment of the test utterance. However, there are many reasons why this ...



Publisher Resources

ISBN: 9781118392669