Techniques for Noise Robustness in Automatic Speech Recognition

by Rita Singh, Tuomas Virtanen, Bhiksha Raj
November 2012
Intermediate to advanced
514 pages
17h 40m
English
Wiley

7 From Signals to Speech Features by Digital Signal Processing

Matthias Wölfel

Pforzheim University, Germany

7.1 Introduction

Acoustic classification of speech signals, as well as some speech feature-enhancement techniques, requires that the speech waveform s(t) be processed into a sequence of feature vectors, the so-called speech features, each with a relatively small number of dimensions. This reduction is necessary to avoid wasting resources on irrelevant information and to avoid the curse of dimensionality. The transformation of the speech waveform into a set of dimension-reduced features is known as speech feature extraction, acoustic preprocessing, or front-end processing.
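
To make the idea of dimension reduction concrete, the following is a minimal sketch of one possible front end, not the method developed in this chapter. It assumes a waveform that has already been sampled, cuts it into overlapping frames, and reduces each frame to a few log band energies. The function names, the frame length, the hop size, the Hamming window, and the equal-width bands are all illustrative assumptions; practical front ends typically use perceptually spaced bands and further transformations.

```python
import math

def frame_signal(samples, frame_len, hop):
    """Split a sampled waveform into overlapping frames.
    A trailing partial frame is dropped (an assumption of this sketch)."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]

def log_band_energies(frame, n_bands):
    """Reduce one frame to n_bands log energies using a naive DFT."""
    n = len(frame)
    # Hamming window to reduce spectral leakage at the frame edges.
    windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * k / (n - 1)))
                for k, x in enumerate(frame)]
    # Power spectrum for the non-negative frequency bins.
    n_bins = n // 2 + 1
    power = []
    for b in range(n_bins):
        re = sum(x * math.cos(2 * math.pi * b * k / n)
                 for k, x in enumerate(windowed))
        im = -sum(x * math.sin(2 * math.pi * b * k / n)
                  for k, x in enumerate(windowed))
        power.append(re * re + im * im)
    # Equal-width bands: a simplifying assumption of this sketch.
    edges = [round(i * n_bins / n_bands) for i in range(n_bands + 1)]
    # Small constant guards against log(0) for silent bands.
    return [math.log(sum(power[edges[i]:edges[i + 1]]) + 1e-10)
            for i in range(n_bands)]

def extract_features(samples, frame_len=64, hop=32, n_bands=8):
    """Map a waveform to a sequence of low-dimensional feature vectors."""
    return [log_band_energies(f, n_bands)
            for f in frame_signal(samples, frame_len, hop)]

# Hypothetical input: 256 samples of a 1 kHz sine sampled at 8 kHz.
sample_rate = 8000
samples = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(256)]
features = extract_features(samples)
print(len(samples), "samples ->", len(features), "vectors of",
      len(features[0]), "dimensions")
```

In this toy setting each 64-sample frame is reduced to 8 numbers. Which information such a reduction keeps or discards is exactly the design question a front end has to answer.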

The set of transformations has to be chosen carefully so that the resulting features contain only the information relevant to the desired task. Feature extraction as applied in automatic speech recognition (ASR) systems aims to preserve the information needed to determine the phonetic class while being invariant to other factors. These include speaker differences such as accent, emotion, fundamental frequency (in the case of nontonal languages), or speaking rate, and distortions such as background noise, channel effects, or reverberation. Other systems may need different information. In speaker verification, for example, one is interested in keeping the speaker-specific characteristics. Note that the correct choice of feature transformation and reduction is critical, because if useful ...


Publisher Resources

ISBN: 9781118392669