Techniques for Noise Robustness in Automatic Speech Recognition

by Rita Singh, Tuomas Virtanen, Bhiksha Raj
November 2012
Intermediate to advanced
514 pages
17h 40m
English
Wiley

9

Feature Compensation

Jasha Droppo

Microsoft Research, USA

9.1 Life in an Ideal World

People convey linguistic messages by generating acoustic speech signals. In an ideal world, we could record such a signal and derive acoustic features that contain all of the information necessary to achieve perfect recognition accuracy, and nothing else.

In our world, the acoustic features are computed from acoustic signals recorded by a microphone, and the information we need is obscured by noise and other irrelevant variabilities. To make matters worse, these features often suffer from linear and nonlinear channel effects, reverberation, and a significant amount of additive noise. Even in the absence of these distortions, the speech portion of the signal itself contains more information than what was said, including how it was said and who said it.
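The distortions listed above can be made concrete with a small simulation. The sketch below is not taken from the chapter: it assumes the commonly used time-domain model in which the recorded signal is the clean signal filtered by a linear channel, plus additive noise scaled to a chosen signal-to-noise ratio. The function names (`convolve`, `mean_power`, `distort`), the sine-wave stand-in for speech, and the three-tap impulse response are all hypothetical choices made for illustration. Nonlinear channel effects are not modeled, and a three-tap filter is far too short to represent real reverberation.

```python
import math
import random


def convolve(signal, impulse_response):
    """Causal FIR filtering, truncated to the length of the input signal."""
    out = []
    for t in range(len(signal)):
        acc = 0.0
        for k, h in enumerate(impulse_response):
            if t - k >= 0:
                acc += h * signal[t - k]
        out.append(acc)
    return out


def mean_power(x):
    """Average squared amplitude of a sequence of samples."""
    return sum(v * v for v in x) / len(x)


def distort(clean, impulse_response, noise, snr_db):
    """Assumed model: noisy = (clean filtered by channel) + gain * noise.

    The gain is chosen so that the ratio of filtered-signal power to
    scaled-noise power equals snr_db. The noise sequence must be at
    least as long as the clean signal.
    """
    filtered = convolve(clean, impulse_response)
    target_ratio = 10 ** (snr_db / 10)
    gain = math.sqrt(mean_power(filtered) / (mean_power(noise) * target_ratio))
    return [f + gain * n for f, n in zip(filtered, noise)]


# Hypothetical inputs: a 220 Hz tone stands in for clean speech.
random.seed(0)
sample_rate = 8000
clean = [math.sin(2 * math.pi * 220.0 * n / sample_rate) for n in range(800)]
channel = [1.0, 0.5, 0.25]  # made-up short impulse response
noise = [random.gauss(0.0, 1.0) for _ in clean]

noisy = distort(clean, channel, noise, snr_db=10.0)
```

Lowering `snr_db` or lengthening `channel` gives progressively more degraded signals, which is one simple way to produce paired clean and noisy data for experimenting with compensation methods.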

Figure 9.1 shows the connection between the ideal speech features that we want, the clean speech features that we may be able to get by carefully controlling the environmental conditions at the time of capture, and the noisy speech that we must often tolerate.

Figure 9.1 The goal of feature compensation is to recover more ideal speech features from observed noisy speech features.


This chapter focuses on feature-enhancement techniques, which strive to remove extraneous information and distortion from a sequence of speech-recognition features, while ...


ISBN: 9781118392669