16 Speech Technologies

Speech technology covers applications such as speech recognition and synthesis, speaker recognition, and the optimized coding and enhancement of speech. Major research efforts in both academia and industry have led to several breakthroughs. After decades of research, speech coding, synthesis, and recognition are now used extensively in many applications. The widespread adoption of speech technologies has been enabled by the advent of sufficiently powerful and relatively inexpensive digital processors.

Many speech-based user interfaces exist. A computer may take commands by recording the user's speech, which requires speech recognition, and it may deliver messages to the user by producing intelligible speech, which requires speech synthesis. This interaction was shown conceptually in Figure I.5 on page 5. In mobile communication, the goal is often to represent speech at as low a bit rate as possible, which has led to many speech coding methods that take the special characteristics of speech signals into account.

This chapter provides a very brief overview of different technologies in speech coding, synthesis, and recognition. The focus is very much on acoustics, signal processing, and audio, and the linguistic and statistical aspects are, in many places, treated superficially. Overall, the aim of this chapter is to give a general description of the main ...

Communication Acoustics: An Introduction to Speech, Audio and Psychoacoustics by Ville Pulkki, Matti Karjalainen