Techniques for Noise Robustness in Automatic Speech Recognition

by Rita Singh, Tuomas Virtanen, Bhiksha Raj
November 2012
Intermediate to advanced
514 pages
17h 40m
English
Wiley

6

Microphone Arrays

John McDonough¹, Kenichi Kumatani²

¹Carnegie Mellon University, USA; ²Disney Research, USA

This contribution addresses the class of techniques suitable for performing speech recognition not on the signal captured by a single microphone, but on the signal obtained by combining the signals from several microphones. The techniques discussed here differ from those presented in Chapter 5 in that they are based on a pair of assumptions:

1. The geometry of the array of microphones is fixed and known.
2. The positions of the active speakers relative to the array are known or can be accurately estimated.

Such techniques, known collectively as beamforming, have been the subject of intense interest in recent years within the acoustic array processing research community. Unfortunately, they have been largely ignored in the mainstream automatic speech-recognition field, although this may rapidly change given the recent release and widespread popularity of the Microsoft Kinect® platform. The simplest of beamforming algorithms, the delay-and-sum beamformer, uses only this geometric knowledge (that is, the arrangement of the microphones and the speaker's position) to compensate for the time delays of the signals arriving at each sensor and then sum them. More sophisticated adaptive beamformers minimize the total output power of the array under the constraint that the desired source must be unattenuated. The conventional adaptive beamforming ...
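To make the two ideas above concrete, here is a minimal NumPy sketch; it is an illustration under stated assumptions, not the chapter's own implementation. It assumes free-field propagation at a fixed speed of sound, applies the delay compensation as a phase shift in the frequency domain (so shifts are circular), and, for the adaptive case, uses the standard closed-form weights often called the MVDR (minimum variance distortionless response) solution for a single frequency bin. The function names `propagation_delays`, `delay_and_sum`, and `mvdr_weights` are hypothetical and chosen only for this example.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s; assumed value for air at room temperature


def propagation_delays(mic_positions, source_position, c=SPEED_OF_SOUND):
    """Time of flight in seconds from the source to each microphone.

    mic_positions:   array of shape (num_mics, 3), in meters
    source_position: array of shape (3,), in meters
    """
    distances = np.linalg.norm(mic_positions - source_position, axis=1)
    return distances / c


def delay_and_sum(signals, delays, sample_rate):
    """Align every channel to the earliest arrival, then average.

    signals: array of shape (num_mics, num_samples)
    delays:  array of shape (num_mics,), in seconds
    """
    num_samples = signals.shape[1]
    relative = delays - delays.min()  # extra delay of each channel
    freqs = np.fft.rfftfreq(num_samples, d=1.0 / sample_rate)
    spectra = np.fft.rfft(signals, axis=1)
    # Advancing a channel by tau seconds multiplies its spectrum
    # by exp(+j * 2 * pi * f * tau); the shift is circular.
    phase = np.exp(2j * np.pi * freqs[None, :] * relative[:, None])
    aligned = spectra * phase
    return np.fft.irfft(aligned.mean(axis=0), n=num_samples)


def mvdr_weights(covariance, steering):
    """Weights w minimizing w^H R w subject to w^H d = 1 (one frequency bin).

    covariance: (num_mics, num_mics) spatial covariance matrix R
    steering:   (num_mics,) steering vector d toward the desired source
    """
    r_inv_d = np.linalg.solve(covariance, steering)
    return r_inv_d / (steering.conj() @ r_inv_d)
```

With an identity covariance matrix and a unit-modulus steering vector, `mvdr_weights` returns `steering / num_mics`, which is the delay-and-sum weighting; in that sense the adaptive beamformer reduces to the fixed one when the noise is spatially white.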

ISBN: 9781118392669