4.1 Introduction

The concept of spatial audio coding is to represent two or more audio channels by means of a down-mix, accompanied by parameters that model the spatial attributes of the original audio signals that are lost in the down-mix process. These ‘spatial parameters’ capture the perceptually relevant spatial attributes of an auditory scene and provide a means to store, process and reconstruct the original spatial image.

This chapter explains the concept of spatial audio coding. The first implementations of spatial audio coding techniques employed a single audio channel as down-mix, an approach also denoted binaural cue coding (BCC). The current chapter explains this single-channel approach and its concepts in detail. The extension to multiple down-mix channels is explained in the context of MPEG Surround in Chapter 6.

Figure 4.1 shows a BCC encoder and decoder. As indicated in the figure, the input audio channels x_c(n) (1 ≤ c ≤ C) are down-mixed to a single audio channel s(n), denoted the down-mix signal. The ‘perceptually relevant differences’ between the audio channels, namely the inter-channel time difference (ICTD), inter-channel level difference (ICLD), and inter-channel coherence (ICC), are estimated as a function of frequency and time and transmitted as side information to the decoder. The decoder generates its output channels x̂_c(n) (1 ≤ c ≤ C) such that the ICTD, ICLD, and ICC between the channels approximate those of the ...
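
The encoder side described above can be illustrated with a minimal sketch. It is not the scheme of Figure 4.1: it assumes a plain averaging down-mix, it estimates the cues once over a whole frame in the time domain rather than as a function of frequency and time, and it takes the peak of the normalized cross-correlation as the ICC and the lag of that peak as the ICTD, which is one common choice. All function names (`downmix`, `icld_db`, `normalized_xcorr`, `ictd_and_icc`) and the test signals are hypothetical.

```python
import math


def downmix(channels):
    """Average C equal-length channels into one down-mix signal s(n).
    Assumption: equal-weight averaging; a real encoder may weight or equalize."""
    count = len(channels)
    return [sum(samples) / count for samples in zip(*channels)]


def icld_db(x1, x2, eps=1e-12):
    """Level difference in dB: power of x2 relative to power of x1."""
    p1 = sum(v * v for v in x1)
    p2 = sum(v * v for v in x2)
    return 10.0 * math.log10((p2 + eps) / (p1 + eps))


def normalized_xcorr(x1, x2, lag):
    """Normalized cross-correlation between x1(n) and x2(n + lag)."""
    n = len(x1)
    num = 0.0
    for i in range(n):
        j = i + lag
        if 0 <= j < n:
            num += x1[i] * x2[j]
    den = math.sqrt(sum(v * v for v in x1) * sum(v * v for v in x2))
    return num / den if den > 0.0 else 0.0


def ictd_and_icc(x1, x2, max_lag):
    """Search lags in [-max_lag, max_lag] for the correlation peak.
    Returns (lag, peak): the lag in samples is used as the ICTD estimate
    (positive means x2 is delayed relative to x1), the peak value as ICC."""
    best_lag, best_val = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        val = normalized_xcorr(x1, x2, lag)
        if val > best_val:
            best_lag, best_val = lag, val
    return best_lag, best_val


# Hypothetical two-channel frame: x2 is x1 delayed by 3 samples and halved.
x1 = [0, 0, 0, 0, 1, 2, 3, 2, 1, 0, 0, 0, 0, 0, 0, 0]
x2 = [0.0] * 3 + [0.5 * v for v in x1[:-3]]

s = downmix([x1, x2])                      # single down-mix channel
level = icld_db(x1, x2)                    # side information: ICLD
delay, coherence = ictd_and_icc(x1, x2, 5)  # side information: ICTD, ICC
print(delay, round(coherence, 3), round(level, 2))
```

In this toy case the second channel is a scaled, delayed copy of the first, so the three cues together with s(n) describe the pair completely. For real signals the cues would be estimated per frequency band and per time frame, as the text states.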
