Techniques for Noise Robustness in Automatic Speech Recognition

by Rita Singh, Tuomas Virtanen, Bhiksha Raj
November 2012
Intermediate to advanced
514 pages
17h 40m
English
Wiley

11 Adaptation and Discriminative Training of Acoustic Models

Yannick Estève, Paul Deléglise

University of Le Mans, France

11.1 Introduction

The main weakness of automatic speech recognition (ASR) systems lies in their lack of robustness to variability. All the knowledge bases used in an ASR system are affected by this problem: the dictionary (that is, the list of words the system can recognize, along with their pronunciation variants), the language models, and the acoustic models. These knowledge bases, particularly the language and acoustic models, which are probabilistic in nature, depend heavily on the data used to estimate their parameters. This dependence of probabilistic models on their training corpora is made more significant by the high cost of building such corpora. Because of that cost, probabilistic models are commonly used in practice in application contexts that differ considerably from the context of their training data.

Such a mismatch between training data and application context causes the models to lose some of their precision and predictive power, which in turn degrades the quality of speech recognition. This is a well-known problem, and it has led to the development of many techniques that aim to lessen its impact. Model adaptation consists of reducing the mismatch between probabilistic models and the data to which they are applied.

Noise is a cause of mismatch: it constitutes a variable phenomenon with potentially ...

Publisher Resources

ISBN: 9781118392669