Speech and Audio Signal Processing: Processing and Perception of Speech and Music, Second Edition

by Ben Gold, Nelson Morgan, Dan Ellis
August 2011
Beginner to intermediate
688 pages
21h 28m
English
Wiley-Interscience

CHAPTER 30

SPEECH SYNTHESIS

30.1 INTRODUCTION

The goal of this chapter is to introduce engineering approaches for “talking” machines that can generate spoken utterances without requiring every possible utterance to be prerecorded. Generally, speech synthesis requires the use of sub-word units in order to provide the extended or even arbitrary vocabularies required for applications such as text-to-speech (TTS), which is the most common application of speech synthesis. A TTS system operates as a pipeline of processes, taking text as input and producing a digitized speech waveform as output. The pipeline can be described in two main parts: the “front end”, which converts text into some kind of linguistic specification; and the waveform generation component, which takes that linguistic specification and creates an appropriate speech waveform.
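The two-part structure can be sketched in a few lines of Python. This is only an illustration of the data flow, not a working synthesizer: the `LinguisticSpec` class, the function names, and the sine-burst "waveform generator" are all hypothetical placeholders introduced here, and lowercase letters stand in for real sub-word units.

```python
import math
import string
from dataclasses import dataclass
from typing import List


@dataclass
class LinguisticSpec:
    """Hypothetical container for whatever the front end produces."""
    units: List[str]  # stand-in for sub-word units (here: lowercase letters)


def front_end(text: str) -> LinguisticSpec:
    # Placeholder analysis: keep only ASCII letters, lowercased.
    # A real front end would infer far richer linguistic information.
    units = [ch for ch in text.lower() if ch in string.ascii_lowercase]
    return LinguisticSpec(units=units)


def generate_waveform(spec: LinguisticSpec,
                      sample_rate: int = 16000,
                      samples_per_unit: int = 800) -> List[float]:
    # Placeholder generation: one short sine burst per unit, with a
    # frequency chosen arbitrarily from the letter. This only shows that
    # the output is a sequence of digitized samples.
    samples: List[float] = []
    for unit in spec.units:
        freq = 200.0 + 10.0 * (ord(unit) - ord("a"))
        for i in range(samples_per_unit):
            samples.append(math.sin(2.0 * math.pi * freq * i / sample_rate))
    return samples


def text_to_speech(text: str) -> List[float]:
    # The pipeline: text -> linguistic specification -> waveform.
    return generate_waveform(front_end(text))
```

Calling `text_to_speech("Hi!")` yields 1600 samples, 800 for each of the two letters. The point of the sketch is the separation of concerns: either stage can be replaced without touching the other, as long as both agree on the form of the linguistic specification.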

The task of the front end is to infer useful information from the text; that is, information that will help in generating an appropriate waveform. The written form of a language does not fully specify the spoken form, so prior knowledge must be used in order to produce the spoken form correctly. Some examples of using prior knowledge to enrich the information encoded in the written form include:

1. Text preprocessing: Ambiguities in the written form, such as abbreviations and acronyms, must be resolved. An example of this is the translation ...
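As a rough illustration of this kind of preprocessing, the sketch below expands a few abbreviations before any further analysis. The abbreviation table, the function names, and the context rule for "St." are assumptions made for this example and are not taken from the chapter; a practical system would need much larger tables and more robust disambiguation.

```python
from typing import List

# Hypothetical table of abbreviations treated here as unambiguous.
UNAMBIGUOUS = {
    "Dr.": "Doctor",
    "Mr.": "Mister",
    "etc.": "et cetera",
}


def expand_st(tokens: List[str], i: int) -> str:
    # Toy heuristic for the ambiguous token "St.": read it as "Saint"
    # when the next token is capitalized, otherwise as "Street".
    nxt = tokens[i + 1] if i + 1 < len(tokens) else ""
    return "Saint" if nxt[:1].isupper() else "Street"


def normalize(text: str) -> str:
    # Split on whitespace, expand known abbreviations, and rejoin.
    tokens = text.split()
    out: List[str] = []
    for i, tok in enumerate(tokens):
        if tok == "St.":
            out.append(expand_st(tokens, i))
        else:
            out.append(UNAMBIGUOUS.get(tok, tok))
    return " ".join(out)


print(normalize("Dr. Smith lives on Baker St. near St. Paul"))
# prints: Doctor Smith lives on Baker Street near Saint Paul
```

The heuristic fails easily, for instance when "St." ends a sentence and the next sentence begins with a capital letter, which is one reason the resolution of such ambiguities relies on prior knowledge rather than on the written form alone.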

ISBN: 9780470195369