Voice Activity Detection, Noise Estimation, and Adaptive Filters for Acoustic Signal Enhancement

Rainer Martin, Dorothea Kolossa

Ruhr-Universität Bochum, Germany

4.1 Introduction

The presence of acoustic noise degrades automatic speech-recognition (ASR) performance as it adds irrelevant information to the target signal. In the best case, this irrelevant information does not disturb the speech recognizer; in the worst case, it leads to a complete mismatch of the acoustic signal and the signal model of the recognizer. One widely used approach to improve the performance of ASR is to filter the acoustic signal such that the amount of irrelevant information is reduced and the match of the signal with its model is improved.

Over the past two decades or so, many different filtering methods for noise reduction have been proposed, using either a single microphone signal or multiple microphone signals. Although beamforming methods based on multiple microphone signals yield larger improvements than single-microphone processing methods, the latter are very widely used.

On the one hand, single-channel approaches are relatively easy to apply: their microphone arrangement requires less space, and they generally need less hardware and fewer computational resources. On the other hand, single-channel methods do not provide spatial selectivity and are restricted in their ability to remove time-varying noise components. Therefore, the complete restoration of the undisturbed speech signal, as desirable as it would ...
