Techniques for Noise Robustness in Automatic Speech Recognition

by Rita Singh, Tuomas Virtanen, Bhiksha Raj
November 2012
Intermediate to advanced
514 pages
17h 40m
English
Wiley

17 Uncertainty Decoding

Hank Liao

Google Inc., USA

One may view the accuracy-degrading effects of noise in an automatic speech recognition system as increasing uncertainty while decoding the speech. To mitigate this, the statistical models used for recognition can be updated to reflect the error or uncertainty introduced by noise in the test environment. The greater the difference between the test and training conditions, the greater the uncertainty. Several approaches motivated by this idea are presented in this chapter; they are often grouped under the broad category of uncertainty decoding. Previous chapters have discussed methods to address environmental noise by using speech enhancement (Chapter 9), affine transformations of the features or model parameters (Chapter 11), or updating the acoustic model parameters (Chapter 12). This chapter discusses how these standard techniques relate to uncertainty decoding, demonstrates how they can be extended to handle uncertainty due to noise, and presents the strengths and weaknesses of various uncertainty decoding forms for noise-robust speech recognition.
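To make the idea of "updating the model to reflect uncertainty" concrete, the following is a minimal sketch of one common form found in the literature, and not necessarily the formulation this chapter develops. It assumes diagonal-covariance Gaussian acoustic models and an enhanced feature vector that comes with a per-dimension uncertainty variance; that variance is simply added to the model variance before scoring. All names and numbers (`uncertainty_decoding_loglik`, `MEAN_A`, `MEAN_B`, `VAR`) are hypothetical and chosen only for illustration.

```python
import math


def diag_gauss_loglik(x, mean, var):
    """Log-likelihood of x under a diagonal-covariance Gaussian."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def uncertainty_decoding_loglik(x_hat, mean, var, uncertainty_var):
    """Score an enhanced feature x_hat with the model variance inflated
    by the uncertainty variance of the feature estimate (an assumed,
    variance-inflation form of uncertainty decoding)."""
    inflated = [v + u for v, u in zip(var, uncertainty_var)]
    return diag_gauss_loglik(x_hat, mean, inflated)


# Two hypothetical single-Gaussian classes sharing unit variance.
MEAN_A = [0.0, 0.0]
MEAN_B = [3.0, 3.0]
VAR = [1.0, 1.0]


def class_gap(x_hat, uncertainty_var):
    """Log-likelihood margin of class A over class B for x_hat."""
    a = uncertainty_decoding_loglik(x_hat, MEAN_A, VAR, uncertainty_var)
    b = uncertainty_decoding_loglik(x_hat, MEAN_B, VAR, uncertainty_var)
    return a - b


x_hat = [1.0, 1.0]
gap_confident = class_gap(x_hat, [0.0, 0.0])  # no uncertainty: plain scoring
gap_uncertain = class_gap(x_hat, [3.0, 3.0])  # large uncertainty: flatter scores
print(gap_confident, gap_uncertain)
```

With zero uncertainty the scoring reduces to the ordinary Gaussian likelihood. As the uncertainty variance grows, the margin between the two classes shrinks (here from about 3.0 to about 0.75), so frames whose features are unreliable contribute less to the decoding decision.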

17.1 Introduction

The problem of speech recognition in noise results from mismatched training and test conditions. Acoustic noise in testing or actual usage conditions that is unaccounted for in training is unexpected and degrades recognition performance. Feature-based approaches to noise robustness, such as those presented in Chapter 5 or 9, remove the noise from the ...


ISBN: 9781118392669