7.2 Phase Vocoder Basics

The concepts of short-time Fourier analysis and synthesis have been widely described in the literature [Por76, Cro80, CR83]. We briefly summarize the basics and define the notation used for their application to digital audio effects.

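The short-time Fourier transform (STFT) of the signal x(n) is given by

$$X(n, k) = \sum_{m=-\infty}^{\infty} x(m)\, h(n-m)\, e^{-j 2\pi m k / N}, \qquad k = 0, 1, \ldots, N-1. \tag{7.1}$$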

X(n, k) is a complex number and represents the magnitude |X(n, k)| and phase φ(n, k) of a time-varying spectrum with frequency bin (index) 0 ≤ k ≤ N − 1 and time index n. Note that the summation index in (7.1) is m. At each time index n the signal x(m) is weighted by a finite-length window h(n − m). Thus the computation of (7.1) can be performed as a finite sum over m with an FFT of length N. Figure 7.3 shows the input signal x(m) and the sliding window h(n − m) for three values of the time index n. The middle plot shows the finite-length windowed segments x(m) · h(n − m). These segments are transformed by the FFT, yielding the short-time spectra X(n, k) given by (7.1). The lower two rows in Figure 7.3 show the magnitude and phase spectra of the corresponding time segments.
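The computation described above can be illustrated with a minimal NumPy sketch. It is not taken from the text: the function names `stft_frame_direct` and `stft_frame_fft`, the Hann window, and the test signal are assumptions chosen for illustration, and the window is assumed to be causal with h(j) nonzero only for 0 ≤ j ≤ N − 1. The first function evaluates the windowed sum over m directly; the second places each windowed sample at position m mod N in a buffer of length N and applies one FFT, which gives the same result because the complex exponential is periodic in m with period N.

```python
import numpy as np

def stft_frame_direct(x, h, n, N):
    """Sum over m of x(m) * h(n - m) * exp(-j*2*pi*m*k/N), for k = 0..N-1."""
    L = len(h)
    k = np.arange(N)
    X = np.zeros(N, dtype=complex)
    # h(n - m) is nonzero only for n - L + 1 <= m <= n (causal window, assumed).
    for m in range(n - L + 1, n + 1):
        if 0 <= m < len(x):
            X += x[m] * h[n - m] * np.exp(-2j * np.pi * m * k / N)
    return X

def stft_frame_fft(x, h, n, N):
    """Same spectrum computed with one FFT of length N (requires len(h) <= N)."""
    L = len(h)
    if L > N:
        raise ValueError("window length must not exceed the FFT length")
    buf = np.zeros(N, dtype=complex)
    for m in range(n - L + 1, n + 1):
        if 0 <= m < len(x):
            # exp(-j*2*pi*m*k/N) depends on m only through m mod N.
            buf[m % N] += x[m] * h[n - m]
    return np.fft.fft(buf)

# Illustrative setup (assumed values): a cosine centered on bin 8, Hann window.
N = 64
h = np.hanning(N)
m = np.arange(4 * N)
x = np.cos(2 * np.pi * 8 * m / N)

# Short-time spectra at three time indices, in the spirit of Figure 7.3.
frames = {n: stft_frame_fft(x, h, n, N) for n in (N - 1, 2 * N - 1, 3 * N - 1)}
magnitude = {n: np.abs(X) for n, X in frames.items()}   # |X(n, k)|
phase = {n: np.angle(X) for n, X in frames.items()}     # phi(n, k)
```

The direct version costs on the order of N² operations per frame and is included only as a reference against which the FFT version can be checked.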

Figure 7.3 Sliding analysis window and short-time Fourier transform.

