CHAPTER 40

SPEECH TRANSFORMATIONS

40.1 INTRODUCTION

A variety of techniques can modify the speed, pitch, and spectrum of a speech signal. Some methods work directly on the speech waveform to modify the time scale or pitch. Other methods are based on analysis–synthesis systems (i.e., vocoders), in which the derived parameters can be adjusted to modify the synthetic output. However, some medium- and high-rate vocoder systems do not explicitly compute the fundamental frequency, which complicates their use for pitch modification.

Speech modification techniques have many applications. For instance, as noted in Chapter 30, pitch and duration must often be modified for concatenative synthesis. Speeding up a voice response system can save time for a busy, impatient user. Speeding up speech may also be useful in speech communication channels subject to fading. Compressing the spectrum could potentially help people with hearing disabilities.

The following three sections explain some of the fundamental issues in speech transformations. This is followed by a study of speech modification in analysis–synthesis systems, that is, channel vocoders, LPC vocoders, and homomorphic vocoders. The chapter concludes with a review of three specific systems: the phase vocoder [4], the Seneff system [20], and the sine-transform coder of Quatieri and McAulay [17].

40.2 TIME-SCALE MODIFICATION

A popular ...
