4 Speech Signal Analysis and Modelling
4.1 Introduction
Researchers have studied the speech signal for many years and for many applications. Some studies have broken the speech signal down into its smallest units, called phonemes. Here, we describe the speech signal in terms of its general characteristics. Speech segments can be classified as voiced or unvoiced. A voiced segment is characterised by relatively high energy and, more importantly, by periodicity, which is called the pitch of voiced speech. Unvoiced speech, on the other hand, looks more like random noise with no periodicity. Some parts of speech, however, are neither purely voiced nor purely unvoiced but a mixture of the two. These are usually called transition regions, where the signal changes either from voiced to unvoiced or from unvoiced to voiced. Amplitude versus time plots of typical voiced and unvoiced speech are shown in Figure 4.1 (note that the unvoiced sound has been amplified five times).
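The two cues described above, energy and periodicity, can be illustrated with a minimal sketch. This is not the book's method: the function names `frame_features` and `is_voiced`, the pitch search range of 60 to 400 Hz, and both thresholds are assumptions chosen for illustration only. Periodicity is measured here as the largest normalised autocorrelation value within the assumed pitch range.

```python
import numpy as np

def frame_features(frame, fs, f0_min=60.0, f0_max=400.0):
    """Return (energy, periodicity, pitch_hz) for one frame of samples.

    f0_min and f0_max are assumed pitch search limits, not values from the text.
    """
    x = np.asarray(frame, dtype=float)
    x = x - x.mean()                      # remove any DC offset
    energy = float(np.mean(x ** 2))       # short-time energy (mean square)
    if energy == 0.0:
        return 0.0, 0.0, 0.0              # silent frame: nothing to measure
    # Autocorrelation at non-negative lags, normalised so that r[0] == 1.
    r = np.correlate(x, x, mode="full")[len(x) - 1:]
    r = r / r[0]
    # Search only the lags that correspond to the assumed pitch range.
    lag_min = int(fs / f0_max)
    lag_max = min(int(fs / f0_min), len(x) - 1)
    lag = lag_min + int(np.argmax(r[lag_min:lag_max + 1]))
    return energy, float(r[lag]), fs / lag

def is_voiced(frame, fs, energy_thresh=1e-4, periodicity_thresh=0.3):
    """Crude voiced/unvoiced decision; both thresholds are illustrative guesses."""
    energy, periodicity, _ = frame_features(frame, fs)
    return energy > energy_thresh and periodicity > periodicity_thresh
```

A frame that sits in a transition region would typically fall near these thresholds, which is one reason a hard two-way decision of this kind is only a rough approximation.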
Some speech coding schemes require a frequency-domain representation of the speech signal, and the short-time Fourier transform is very useful for this purpose. Short-time spectral analysis is also valuable for examining a segment of the speech signal and identifying features that are not obvious from the time-domain representation.
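As a rough illustration of what a short-time spectral view looks like in practice, the sketch below splits a signal into overlapping frames, applies a window to each, and takes the magnitude of its discrete Fourier transform. The function name `short_time_spectrum`, the Hamming window, and the frame length and hop size are assumptions made for this example, not parameters taken from the text.

```python
import numpy as np

def short_time_spectrum(x, fs, frame_len=256, hop=128):
    """Magnitude spectra of overlapping, Hamming-windowed frames.

    Returns (freqs_hz, mags), where mags has shape
    (n_frames, frame_len // 2 + 1). frame_len and hop are assumed values.
    """
    x = np.asarray(x, dtype=float)
    if len(x) < frame_len:
        raise ValueError("signal is shorter than one frame")
    window = np.hamming(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    # Stack the frames row by row; trailing samples that do not fill
    # a complete frame are ignored.
    frames = np.stack([x[i * hop:i * hop + frame_len] for i in range(n_frames)])
    mags = np.abs(np.fft.rfft(frames * window, axis=1))
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / fs)
    return freqs, mags
```

For a voiced frame, the rows of `mags` would be expected to show peaks at multiples of the pitch frequency, a feature that is harder to read off the time-domain waveform.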
4.2 Short-Time Spectral Analysis
The short-time Fourier transform plays a fundamental role in frequency domain analysis of ...