In Chapter 19, we described spectral representations that are based on the signal and, to some extent, on properties of human hearing, in particular the property of requiring less frequency resolution at high frequencies. In Chapter 20, we showed that cepstral processing could provide a smoothed spectral representation that is useful for many speech applications. In both cases, however, we made no explicit use of our knowledge of how the excitation spectrum is shaped by the vocal tract. As noted in Chapters 10 and 11, speech can be modeled as being produced by a periodic or noiselike source that is driving a nonuniform tube. It can be shown that basing the analysis (in a very general way) on such a production model leads to a spectral estimate that is both succinct and smooth, and for which the nature of the smoothness has a number of desirable properties. This is the main topic of this chapter.^{1}

In Chapter 10, we showed that a discrete model of a lossless uniform tube led to an input–output relationship for an excitation at one end and the other end closed (see Eqs. 10.21 and 10.22, and Figs. 10.5 and 10.6). For the case in which the far end of the tube is open, we noted that the complex poles of the tube transfer function would be on the unit circle at frequencies given by
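
As a rough numerical illustration, the sketch below assumes the standard quarter-wavelength result for a lossless uniform tube of length l that is excited at one end and open at the other, f_n = (2n + 1)c / (4l) for n = 0, 1, 2, .... That formula, the function name, the 17-cm tube length, and the 340 m/s speed of sound are assumptions chosen for illustration; they are not taken from the text above.

```python
def open_tube_resonances(length_m, n_resonances=3, speed_of_sound=340.0):
    """Resonant frequencies (Hz) of a lossless uniform tube, open at the far end.

    Assumes the quarter-wavelength relation f_n = (2n + 1) * c / (4 * l),
    with n = 0, 1, 2, ...; both default values are illustrative.
    """
    return [
        (2 * n + 1) * speed_of_sound / (4.0 * length_m)
        for n in range(n_resonances)
    ]


# Hypothetical example: a 17-cm tube, roughly the length of an adult vocal tract.
resonances = open_tube_resonances(0.17)
for n, f in enumerate(resonances):
    print(f"n = {n}: {f:.0f} Hz")
```

Under these assumptions the resonances are odd multiples of the lowest one, which for a 17-cm tube falls near 500 Hz.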

We further noted that, ...
