Factorial Models for Noise Robust Speech Recognition

John R. Hershey¹, Steven J. Rennie², Jonathan Le Roux¹

¹Mitsubishi Electric Research Laboratories, USA; ²IBM Thomas J. Watson Research Center, USA

12.1 Introduction

Noise compensation techniques for robust automatic speech recognition (ASR) attempt to improve system performance in the presence of acoustic interference. In feature-based noise compensation, which includes speech enhancement approaches, the acoustic features that are sent to the recognizer are first processed to remove the effects of noise (see Chapter 9). Model compensation approaches, in contrast, are concerned with modifying and even extending the acoustic model of speech to account for the effects of noise. A taxonomy of the different approaches to noise compensation is depicted in Figure 12.1, which serves as a road map for the present discussion.

Figure 12.1 Noise compensation methods in a Venn diagram. The shaded region represents model-based noise compensation, the subject of this chapter. Note that the term “model” in “model compensation” refers to the recognizer's acoustic model, whereas in “model-based noise compensation,” it refers to the models of additive noise.
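The distinction drawn above between feature-based and model compensation can be illustrated with a deliberately simplified sketch. It is not the chapter's method. It assumes a single scalar feature, a Gaussian clean-speech model, and noise that simply adds an offset to that feature. All names and numbers (`speech_mean`, `noise_offset`, and so on) are hypothetical and chosen only for illustration.

```python
import math


def gaussian_logpdf(x, mean, var):
    """Log-density of a scalar Gaussian N(mean, var) evaluated at x."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


# Hypothetical toy values (assumptions, not taken from the chapter).
speech_mean, speech_var = 2.0, 0.5   # clean-speech acoustic model
noise_offset = 1.5                   # assumed mean shift caused by noise


def feature_based_score(noisy_obs):
    """Feature-based compensation: clean the feature, keep the model as is."""
    cleaned = noisy_obs - noise_offset
    return gaussian_logpdf(cleaned, speech_mean, speech_var)


def model_based_score(noisy_obs, noise_var=0.0):
    """Model compensation: keep the feature, shift (and widen) the model.

    For independent additive Gaussian speech and noise, means and
    variances add, so the compensated model is
    N(speech_mean + noise_offset, speech_var + noise_var).
    """
    return gaussian_logpdf(
        noisy_obs, speech_mean + noise_offset, speech_var + noise_var
    )


if __name__ == "__main__":
    y = 3.2  # hypothetical noisy observation
    print(feature_based_score(y), model_based_score(y))
```

In this toy setting the two scores coincide when the noise is a fixed offset (`noise_var=0.0`). Once the noise is itself uncertain, the model-side version can also widen the variance, which a simple subtraction on the feature side cannot express.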


The two main strategies used for model compensation approaches are model adaptation and model-based noise compensation. Model adaptation approaches implicitly account for noise by adjusting ...
