Parametric Time-Frequency Domain Spatial Audio

Book Description

A comprehensive guide that addresses the theory and practice of spatial audio

This book provides readers with the principles and best practices in spatial audio signal processing. It describes how sound fields and their perceptual attributes are captured and analyzed within the time-frequency domain, how essential representation parameters are coded, and how such signals are efficiently reproduced for practical applications. The book is split into four parts, starting with an overview of the fundamentals. It then explains the reproduction of spatial sound before examining signal-dependent spatial filtering. The book closes with coverage of current and future applications and the direction in which spatial audio research is heading.

Parametric Time-Frequency Domain Spatial Audio focuses on applications in entertainment audio, including music, home cinema, and gaming, and covers the capturing and reproduction of spatial sound as well as its generation, transduction, representation, transmission, and perception. The book teaches readers the tools needed for such processing and provides an overview of existing research. It also presents recent projects and commercial applications built on these systems.

  • Provides an in-depth presentation of the principles, past developments, state-of-the-art methods, and future research directions of spatial audio technologies
  • Includes contributions from leading researchers in the field
  • Offers MATLAB code for selected chapters

An advanced book aimed at readers who are comfortable with the mathematics of digital signal processing and sound field analysis, Parametric Time-Frequency Domain Spatial Audio is best suited for researchers in academia and in the audio industry.

Table of Contents

  1. List of Contributors
  2. Preface
    1. Notes
  3. About the Companion Website
  4. Part I Analysis and Synthesis of Spatial Sound
    1. 1 Time–Frequency Processing: Methods and Tools
      1. 1.1 Introduction
      2. 1.2 Time–Frequency Processing
      3. 1.3 Processing of Spatial Audio
      4. Note
      5. References
    2. 2 Spatial Decomposition by Spherical Array Processing
      1. 2.1 Introduction
      2. 2.2 Sound Field Measurement by a Spherical Array
      3. 2.3 Array Processing and Plane-Wave Decomposition
      4. 2.4 Sensitivity to Noise and Standard Regularization Methods
      5. 2.5 Optimal Noise-Robust Design
      6. 2.6 Spatial Aliasing and High Frequency Performance Limit
      7. 2.7 High Frequency Bandwidth Extension by Aliasing Cancellation
      8. 2.8 High Performance Broadband PWD Example
      9. 2.9 Summary
      10. 2.10 Acknowledgment
      11. References
    3. 3 Sound Field Analysis Using Sparse Recovery
      1. 3.1 Introduction
      2. 3.2 The Plane-Wave Decomposition Problem
      3. 3.3 Bayesian Approach to Plane-Wave Decomposition
      4. 3.4 Calculating the IRLS Noise-Power Regularization Parameter
      5. 3.5 Numerical Simulations
      6. 3.6 Experiment: Echoic Sound Scene Analysis
      7. 3.7 Conclusions
      8. Appendix
      9. References
  5. Part II Reproduction of Spatial Sound
    1. 4 Overview of Time–Frequency Domain Parametric Spatial Audio Techniques
      1. 4.1 Introduction
      2. 4.2 Parametric Processing Overview
      3. References
    2. 5 First-Order Directional Audio Coding (DirAC)
      1. 5.1 Representing Spatial Sound with First-Order B-Format Signals
      2. 5.2 Some Notes on the Evolution of the Technique
      3. 5.3 DirAC with Ideal B-Format Signals
      4. 5.4 Analysis of Directional Parameters with Real Microphone Setups
      5. 5.5 First-Order DirAC with Monophonic Audio Transmission
      6. 5.6 First-Order DirAC with Multichannel Audio Transmission
      7. 5.7 DirAC Synthesis for Headphones and for Hearing Aids
      8. 5.8 Optimizing the Time–Frequency Resolution of DirAC for Critical Signals
      9. 5.9 Example Implementation
      10. 5.10 Summary
      11. References
    3. 6 Higher-Order Directional Audio Coding
      1. 6.1 Introduction
      2. 6.2 Sound Field Model
      3. 6.3 Energetic Analysis and Estimation of Parameters
      4. 6.4 Synthesis of Target Setup Signals
      5. 6.5 Subjective Evaluation
      6. 6.6 Conclusions
      7. Note
      8. References
    4. 7 Multi-Channel Sound Acquisition Using a Multi-Wave Sound Field Model
      1. 7.1 Introduction
      2. 7.2 Parametric Sound Acquisition and Processing
      3. 7.3 Multi-Wave Sound Field and Signal Model
      4. 7.4 Direct and Diffuse Signal Estimation
      5. 7.5 Parameter Estimation
      6. 7.6 Application to Spatial Sound Reproduction
      7. 7.7 Summary
      8. Notes
      9. References
    5. 8 Adaptive Mixing of Excessively Directive and Robust Beamformers for Reproduction of Spatial Sound
      1. 8.1 Introduction
      2. 8.2 Notation and Signal Model
      3. 8.3 Overview of the Method
      4. 8.4 Loudspeaker-Based Spatial Sound Reproduction
      5. 8.5 Binaural-Based Spatial Sound Reproduction
      6. 8.6 Conclusions
      7. References
    6. 9 Source Separation and Reconstruction of Spatial Audio Using Spectrogram Factorization
      1. 9.1 Introduction
      2. 9.2 Spectrogram Factorization
      3. 9.3 Array Signal Processing and Spectrogram Factorization
      4. 9.4 Applications of Spectrogram Factorization in Spatial Audio
      5. 9.5 Discussion
      6. 9.6 Matlab Example
      7. Note
      8. References
  6. Part III Signal-Dependent Spatial Filtering
    1. 10 Time–Frequency Domain Spatial Audio Enhancement
      1. 10.1 Introduction
      2. 10.2 Signal-Independent Enhancement
      3. 10.3 Signal-Dependent Enhancement
      4. References
    2. 11 Cross-Spectrum-Based Post-Filter Utilizing Noisy and Robust Beamformers
      1. 11.1 Introduction
      2. 11.2 Notation and Signal Model
      3. 11.3 Estimation of the Cross-Spectrum-Based Post-Filter
      4. 11.4 Implementation Examples
      5. 11.5 Conclusions and Further Remarks
      6. 11.6 Source Code
      7. Note
      8. References
    3. 12 Microphone-Array-Based Speech Enhancement Using Neural Networks
      1. 12.1 Introduction
      2. 12.2 Time–Frequency Masks for Speech Enhancement Using Supervised Learning
      3. 12.3 Artificial Neural Networks
      4. 12.4 Mask Learning: A Simulated Example
      5. 12.5 Mask Learning: A Real-World Example
      6. 12.6 Conclusions
      7. 12.7 Source Code
      8. Notes
      9. References
  7. Part IV Applications
    1. 13 Upmixing and Beamforming in Professional Audio
      1. 13.1 Introduction
      2. 13.2 Stereo-to-Multichannel Upmix Processor
      3. 13.3 Digitally Enhanced Shotgun Microphone
      4. 13.4 Surround Microphone System Based on Two Microphone Elements
      5. 13.5 Summary
      6. References
    2. 14 Spatial Sound Scene Synthesis and Manipulation for Virtual Reality and Audio Effects
      1. 14.1 Introduction
      2. 14.2 Parametric Sound Scene Synthesis for Virtual Reality
      3. 14.3 Spatial Manipulation of Sound Scenes
      4. 14.4 Summary
      5. References
    3. 15 Parametric Spatial Audio Techniques in Teleconferencing and Remote Presence
      1. 15.1 Introduction and Motivation
      2. 15.2 Background
      3. 15.3 Immersive Audio Communication System (ImmACS)
      4. 15.4 Capture and Reproduction of Crowded Acoustic Environments
      5. 15.5 Conclusions
      6. Notes
      7. References
  8. Index
  9. EULA