# STATISTICAL MODEL TRAINING

## 26.1 INTRODUCTION

In the previous chapter, we introduced the notion of statistical models and sequence recognition; we further introduced the common conditional-independence assumptions that lead to the particular form of generative statistical model called a hidden Markov model (HMM). We then showed how such models could be used to compute the likelihood that each hypothetical model produced the observed sequence of feature vectors. This likelihood was either a total likelihood (using the forward recursion), taking into account all possible state sequences associated with the model, or a Viterbi approximation, taking into account only the most likely state sequence. Further assuming that the language model parameters were separable from the acoustic model parameters, we showed that Bayes' rule gave us the prescription for combining the two models to indicate the model (or sequence of models) that gives the minimum probability of error.
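The distinction between the total likelihood and its Viterbi approximation can be sketched for a toy HMM. The parameters below (a two-state transition matrix, initial probabilities, and per-frame emission likelihoods) are hypothetical values chosen only for illustration; in practice the emission terms would come from per-state density models.

```python
import numpy as np

# Toy 2-state HMM (hypothetical parameters, for illustration only).
A = np.array([[0.7, 0.3],
              [0.4, 0.6]])        # transition probabilities a_ij
pi = np.array([0.6, 0.4])         # initial state probabilities
# Emission likelihoods b_j(x_t) for a 3-frame observation, shape (T, N).
B = np.array([[0.5, 0.1],
              [0.4, 0.3],
              [0.7, 0.2]])

def forward_likelihood(pi, A, B):
    """Total likelihood: forward recursion sums over ALL state sequences."""
    alpha = pi * B[0]
    for t in range(1, len(B)):
        alpha = (alpha @ A) * B[t]
    return alpha.sum()

def viterbi_likelihood(pi, A, B):
    """Viterbi approximation: likelihood of the single best state sequence."""
    delta = pi * B[0]
    for t in range(1, len(B)):
        delta = (delta[:, None] * A).max(axis=0) * B[t]
    return delta.max()

total = forward_likelihood(pi, A, B)   # 0.0634
best = viterbi_likelihood(pi, A, B)    # 0.04116
```

Since the best path contributes one term to the forward sum, the Viterbi score is always a lower bound on the total likelihood, here accounting for roughly two-thirds of it.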

A key component in this development was the integration of local probability values over the sequence; essentially, this was a local product of state emission and transition probabilities with a cumulative value computed from legal predecessor states. In other words, we derived approaches for determining complete sequence likelihoods given all the ...
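The local step described above can be sketched in the log domain, where the product of emission and transition probabilities becomes a sum; the numbers are hypothetical toy values continuing the illustration, not parameters from the text. The only difference between the two recursions is how the legal predecessors are combined: a log-sum for the forward recursion, a max for Viterbi.

```python
import numpy as np

# One dynamic-programming step in the log domain (toy values).
# prev[i]   : cumulative log score of predecessor state i at time t-1
# logA[i,j] : log transition probability i -> j
# logb[j]   : log emission likelihood of the current frame in state j
prev = np.log(np.array([0.30, 0.04]))
logA = np.log(np.array([[0.7, 0.3],
                        [0.4, 0.6]]))
logb = np.log(np.array([0.4, 0.3]))

# Forward step: log-sum over all legal predecessors (total likelihood).
alpha_t = np.logaddexp.reduce(prev[:, None] + logA, axis=0) + logb

# Viterbi step: max over legal predecessors (best-path likelihood).
delta_t = (prev[:, None] + logA).max(axis=0) + logb
```

Working in log probabilities avoids the numerical underflow that the raw product of many probabilities below one would otherwise cause over long sequences.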
