Articulatory Speech Synthesis from the Fluid Dynamics of the Vocal Apparatus

Book description


This book addresses the problem of articulatory speech synthesis based on computed vocal tract geometries and the basic physics of sound production in the vocal tract. Unlike conventional methods based on analysis/synthesis using the well-known source-filter model, which assumes that the excitation and the filter are independent, we treat the entire vocal apparatus as one mechanical system that produces sound by means of fluid dynamics. The vocal apparatus is represented as a three-dimensional, time-varying mechanism, and sound propagates inside it as non-planar acoustic waves through a viscous, compressible fluid described by the Navier-Stokes equations.

We propose a combined minimum-energy and minimum-jerk criterion to compute the dynamics of the vocal tract during articulation. Theoretical error bounds and experimental results show that this method closely matches the phonetic target positions while avoiding abrupt changes in the articulatory trajectory. The vocal folds are set into aerodynamic oscillation by the flow of air from the lungs, and the modulated air stream then excites the moving vocal tract. Simulations with this coupled model show strong evidence of source-filter interaction.

Based on our results, we propose that the articulatory speech production model has the potential both to synthesize speech and to provide a compact parameterization of the speech signal that can be useful in a wide variety of speech signal processing problems.

Table of Contents: Introduction / Literature Review / Estimation of Dynamic Articulatory Parameters / Construction of Articulatory Model Based on MRI Data / Vocal Fold Excitation Models / Experimental Results of Articulatory Synthesis / Conclusion

Table of contents

  1. Preface
  2. Introduction
    1. History of Speech Synthesis
    2. Speech Production
    3. Contributions
    4. Organization of the Book
  3. Literature Review
    1. Overview of Speech Synthesis Techniques
      1. Concatenative Synthesis
      2. Formant Synthesis
      3. Articulatory Synthesis
    2. Overview of Speech Production Model
      1. Source-Filter Speech Production Model
      2. Fricative Model
      3. Unvoiced Speech Sound Production Model
    3. Overview of Articulatory Speech Model
      1. Coker's Model
      2. Synthesis of Speech Phonemes
      3. Mermelstein's Model
      4. Task-Dynamic Model
    4. Overview of the Motor Control of the Articulator
      1. A Dynamic Model of Articulation
      2. Motor Control Based on Minimum Cost Principles
    5. Summary
  4. Estimation of Dynamic Articulatory Parameters
    1. Cubic Spline Method
    2. Review of the Signal Representation Techniques
      1. Introduction
      2. L2 Space
      3. Convolution-Based Signal Representations
      4. Interpolation and Quasi-Interpolation
      5. Convolution-Based Least Squares
      6. Strang-Fix Conditions
    3. Pointwise Error Analysis
      1. Interpolation Error
      2. Least Squares Error
    4. L2 Error Analysis
      1. L2 Error of Quasi-Interpolation
      2. L2 Error of the LS Approximation
      3. Comparison
    5. Experimental Results
    6. Discussion
    7. Future Work
    8. Summary
  5. Construction of Articulatory Model Based on MRI Data
    1. Problem Formulation
    2. Vocal Cords Models
    3. Multi-Mass Model
    4. Simulation Result and Future Work
  6. Vocal Fold Excitation Models
    1. Parametric Models
      1. Rosenberg's Model
      2. Titze's Model
    2. Mechanical Model
      1. Two-Mass Model
      2. M-Mass Model
    3. Simulation Results
    4. Discussion
    5. Summary
  7. Experimental Results of Articulatory Synthesis
    1. Governing Equations, Fluid Dynamics Analysis
    2. Synthesized Waveform
    3. Speech Analysis Results
      1. LPC Spectrum and the Short-Time Power Spectrum
      2. Spectrogram
    4. Analysis of the Velocity, Vorticity, and Pressure Fields
    5. Summary
  8. Conclusion
  9. Bibliography
  10. Authors' Biographies

Product information

  • Title: Articulatory Speech Synthesis from the Fluid Dynamics of the Vocal Apparatus
  • Author(s): Stephen Levinson, Don Davis, Scott Slimon, Jun Huang
  • Release date: July 2012
  • Publisher(s): Morgan & Claypool Publishers
  • ISBN: 9781598291797