4.4 G.711 APPENDIX-II VAD/CNG ALGORITHM
G.729B is the VAD/CNG recommendation that works with the G.729 and G.729A codecs. G.711 Appendix-II (referred to here as VAD-II) reuses part of G.729B, with slight deviations in some parameter calculations. In G.729AB, various modules and parameters are shared between the VAD and the regular speech compression operations. In VAD-II, by contrast, the VAD/CNG operations and the G.711 codec are independent. A detailed description is available in [ITU-T-G.711 (2000), ITU-T-G.729B (1996), Kondoz (1999), Goldberg et al. (2000)]. Voice activity detection proceeds through several algorithmic steps followed by a decision process. The main operations are summarized here.
- Preprocessing: The input signal is first passed through a first-order high-pass infinite impulse response (IIR) filter to remove unwanted low-frequency components and any impulse noise spikes.
- Autocorrelation: Short-term prediction, or linear predictive coding (LPC) analysis, of the speech signal is performed once per speech frame using autocorrelation with a 25-ms asymmetric window. The analysis window consists of two parts. The first part is half of a Hamming window, and the second part is a quarter of a cosine function cycle. The window spans 20 ms of previous samples and 5 ms of look-ahead samples, which maintains compatibility with G.729 implementations that account for the 5-ms algorithmic delay at the encoder stage. A set of 11 coefficients is computed ...
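The preprocessing and windowing steps described above can be illustrated with a minimal Python sketch. It is not the reference implementation, and several details are assumptions rather than values taken from the recommendation: the 8-kHz sampling rate, the filter coefficient `alpha`, the exact window formulas (written by analogy with the G.729 LP analysis window), the placement of the half-Hamming part over the 20 ms of previous samples, and the reading of the 11 coefficients as autocorrelation lags 0 to 10. All function and constant names are hypothetical.

```python
import math

FS = 8000        # assumed sampling rate in Hz
PAST = 160       # 20 ms of previous samples at 8 kHz
LOOKAHEAD = 40   # 5 ms of look-ahead samples at 8 kHz
ORDER = 10       # assumption: 11 coefficients = lags 0..10


def highpass_first_order(samples, alpha=0.9):
    """First-order IIR high-pass: y[n] = x[n] - x[n-1] + alpha * y[n-1].

    alpha is an illustrative value, not the coefficient from the standard.
    """
    out = []
    prev_x = 0.0
    prev_y = 0.0
    for x in samples:
        y = x - prev_x + alpha * prev_y
        out.append(y)
        prev_x, prev_y = x, y
    return out


def analysis_window(past=PAST, lookahead=LOOKAHEAD):
    """Asymmetric window: rising half Hamming, then a falling quarter cosine."""
    rising = [0.54 - 0.46 * math.cos(2.0 * math.pi * n / (2 * past - 1))
              for n in range(past)]
    falling = [math.cos(2.0 * math.pi * n / (4 * lookahead - 1))
               for n in range(lookahead)]
    return rising + falling


def autocorrelation(frame, window, order=ORDER):
    """Autocorrelation of the windowed frame for lags 0..order."""
    if len(frame) != len(window):
        raise ValueError("frame and window must have the same length")
    xw = [s * w for s, w in zip(frame, window)]
    return [sum(xw[n] * xw[n - k] for n in range(k, len(xw)))
            for k in range(order + 1)]
```

A 25-ms buffer (200 samples under the assumed rate) would first pass through `highpass_first_order`, and the result would then go to `autocorrelation` together with the output of `analysis_window()`. The zero-lag value returned first is the windowed frame energy and bounds the magnitude of every other lag.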