Uncertainty Decoding

Hank Liao

Google Inc., USA

One may view the accuracy-degrading effects of noise in an automatic speech recognition system as increasing the uncertainty in decoding the speech. To mitigate this, the statistical models used for recognition can be updated to reflect the error or uncertainty introduced by noise in the test environment. The greater the difference between the training and test conditions, the greater the uncertainty. This chapter presents several approaches motivated by this idea, which are often grouped under the broad category of uncertainty decoding. Previous chapters have discussed methods that address environmental noise through speech enhancement (Chapter 9), affine transformations of the features or model parameters (Chapter 11), or updates to the acoustic model parameters (Chapter 12). This chapter discusses how these standard techniques relate to uncertainty decoding, demonstrates how they can be extended to handle uncertainty due to noise, and presents the strengths and weaknesses of various forms of uncertainty decoding for noise-robust speech recognition.
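As a rough illustration of what "updating the model to reflect uncertainty" can mean, the sketch below uses one form that is common in the uncertainty decoding literature: the variance of a Gaussian acoustic model component is inflated by a per-dimension variance describing how unreliable the feature estimate is. This is an assumption made for illustration, not necessarily the formulation this chapter develops, and the names `diag_gaussian_loglik`, `uncertainty_loglik`, and `uncertainty_var` are hypothetical.

```python
import math


def diag_gaussian_loglik(x, mean, var):
    """Log-likelihood of x under a diagonal-covariance Gaussian."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def uncertainty_loglik(x_hat, mean, var, uncertainty_var):
    """Score a feature estimate with the model variance inflated.

    Assumption for this sketch: the uncertainty is an additive,
    per-dimension variance, so each model variance becomes
    var + uncertainty_var before the Gaussian is evaluated.
    """
    inflated = [v + u for v, u in zip(var, uncertainty_var)]
    return diag_gaussian_loglik(x_hat, mean, inflated)


# Two hypothetical one-dimensional components with means 0 and 3, unit variance.
x_hat = [2.5]

# Log-likelihood gap between the components when the feature is trusted.
gap_clean = (uncertainty_loglik(x_hat, [3.0], [1.0], [0.0])
             - uncertainty_loglik(x_hat, [0.0], [1.0], [0.0]))

# The same gap when the feature is treated as highly uncertain.
gap_noisy = (uncertainty_loglik(x_hat, [3.0], [1.0], [9.0])
             - uncertainty_loglik(x_hat, [0.0], [1.0], [9.0]))
```

In this toy setting, the gap between the two components shrinks as the uncertainty variance grows, so an unreliable frame discriminates less between competing models. With zero uncertainty variance, the score reduces to the ordinary Gaussian likelihood.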

17.1 Introduction

The problem of speech recognition in noise results from mismatched training and test conditions. Acoustic noise that is present at test time or in actual use, but unaccounted for in training, degrades recognition performance. Feature-based approaches to noise robustness, such as those presented in Chapters 5 and 9, remove the noise from the ...
