9.2 Motivation and details

As mentioned above, the scheme for joint-coding of audio source signals, shown in Figure 9.2, is based on transmission of the sum of the audio source signals,

\[ s(n) = \sum_{i=1}^{M} s_i(n), \]

where M is the number of source signals and si(n) are the individual source signals.
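As a minimal sketch of the sum signal described above, the following Python fragment adds M equal-length source signals sample by sample. The function name `sum_signal` and the short example signals are hypothetical and chosen only for illustration; a real encoder would operate on framed audio, which is not shown here.

```python
def sum_signal(sources):
    """Return s(n), the sample-wise sum of M equal-length source signals s_i(n)."""
    if not sources:
        raise ValueError("at least one source signal is required")
    length = len(sources[0])
    if any(len(s) != length for s in sources):
        raise ValueError("all source signals must have the same length")
    # zip(*sources) yields, for each time index n, the tuple (s_1(n), ..., s_M(n)).
    return [sum(frame) for frame in zip(*sources)]


# Hypothetical example with M = 3 sources of four samples each.
s1 = [0.5, -0.25, 0.0, 0.25]
s2 = [0.25, 0.25, -0.5, 0.0]
s3 = [0.0, 0.5, 0.25, -0.25]
s = sum_signal([s1, s2, s3])
```

Only `s` would be transmitted as audio; the individual `s1`, `s2`, and `s3` are not available at the decoder.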

As in spatial audio coding techniques, this method relies on the assumption that the perceived auditory spatial image is largely determined by the inter-channel time difference (ICTD), inter-channel level difference (ICLD), and inter-channel coherence (ICC) between the rendered audio channels. Therefore, rather than requiring ‘clean’ source signals si(n) as mixer input in Figure 9.1, it suffices to supply signals ŝi(n) that yield ICTD, ICLD, and ICC at the mixer output similar to those obtained when the real source signals si(n) are supplied to the mixer. There are three goals for the generation of ŝi(n):

  • If ŝi(n) are supplied to a mixer, the mixer output channels should have approximately the same spatial cues (ICLD, ICTD, ICC) as if si(n) were supplied to the mixer.
  • ŝi(n) are to be generated with as little information as possible about the original source signals si(n) (because the goal is low-bitrate side information).
  • ŝi(n) are to be generated from the transmitted sum signal s(n) such that as little signal distortion as possible is introduced.
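The first goal can be illustrated with a toy experiment: two sets of mixer inputs with different waveforms can still produce the same spatial cues at the mixer output. The sketch below assumes a simple stereo mixer that applies one gain per source and channel, measures ICLD as the power ratio of the two output channels in dB, and uses the zero-lag normalized cross-correlation as a simplified stand-in for ICC (ICTD is omitted). All function names, gains, and test signals are hypothetical, and the uniformly scaled "estimates" only demonstrate cue equality; they are not the method used to generate ŝi(n).

```python
import math


def mix_stereo(sources, gains_left, gains_right):
    """Mix M sources to two channels using one gain per source and channel."""
    frames = list(zip(*sources))
    left = [sum(g * x for g, x in zip(gains_left, f)) for f in frames]
    right = [sum(g * x for g, x in zip(gains_right, f)) for f in frames]
    return left, right


def icld_db(left, right):
    """Level difference in dB: 10*log10 of right power over left power."""
    p_left = sum(v * v for v in left)
    p_right = sum(v * v for v in right)
    return 10.0 * math.log10(p_right / p_left)


def icc_zero_lag(left, right):
    """Magnitude of the normalized cross-correlation at lag zero (simplified ICC)."""
    num = sum(a * b for a, b in zip(left, right))
    den = math.sqrt(sum(a * a for a in left) * sum(b * b for b in right))
    return abs(num) / den


# Two hypothetical sources: sinusoids with 3 and 5 cycles over 64 samples.
N = 64
src = [
    [math.sin(2.0 * math.pi * 3 * n / N) for n in range(N)],
    [math.sin(2.0 * math.pi * 5 * n / N) for n in range(N)],
]
# Stand-in "estimates": every source scaled by 0.9, so waveforms differ from src.
est = [[0.9 * v for v in s] for s in src]

g_left, g_right = [1.0, 0.2], [0.5, 0.8]
ref_l, ref_r = mix_stereo(src, g_left, g_right)
est_l, est_r = mix_stereo(est, g_left, g_right)

icld_ref, icc_ref = icld_db(ref_l, ref_r), icc_zero_lag(ref_l, ref_r)
icld_est, icc_est = icld_db(est_l, est_r), icc_zero_lag(est_l, est_r)
```

In this toy case `icld_est` and `icc_est` match `icld_ref` and `icc_ref` up to rounding even though `est` differs from `src` sample by sample, which is the sense in which only the cues, not the exact waveforms, need to be preserved.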

Figure 9.3 A mixer for generating stereo signals given a number of ...
