10.2 Variational Bayesian methods: simple case
The integrals which arise in Bayesian inference are, as we have seen, frequently intractable unless we use priors which are jointly conjugate, and these are often difficult to deal with, as in the case of the normal/chi-squared distribution introduced in Section 2.13. The idea of variational Bayesian methods is to approximate the posterior by a density of a simpler form. They can be thought of as extensions of the EM technique discussed in Section 9.2.
When approximating one distribution by another, it is useful to have a measure of the closeness of the approximation, and for this purpose we shall use the Kullback–Leibler divergence or information
$$I(q:p) = \int q(\theta) \log \frac{q(\theta)}{p(\theta)} \, d\theta$$
(cf. Section 3.11; the integral sign denotes a multiple integral when $\theta$ is multidimensional and, of course, is replaced by a summation in discrete cases). This function satisfies
$$I(q:p) \geq 0.$$
To show this, note that if we use natural logarithms it is easily shown that $\log x \leq x - 1$ and hence for any densities $p$ and $q$ ...
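As a numerical illustration of the Kullback–Leibler divergence in the discrete case, where the integral becomes a sum, the following minimal sketch in Python may be helpful. The function name `kl_divergence` and the two probability vectors `q` and `p` are hypothetical choices made purely for illustration; they are not taken from the text. The sketch assumes natural logarithms and the usual convention that a term with zero probability under the first distribution contributes nothing.

```python
import math


def kl_divergence(q, p):
    """Discrete Kullback-Leibler divergence: sum of q_i * log(q_i / p_i).

    Uses natural logarithms. Terms with q_i == 0 contribute zero, and the
    divergence is infinite if p_i == 0 while q_i > 0.
    """
    if len(q) != len(p):
        raise ValueError("q and p must have the same length")
    total = 0.0
    for qi, pi in zip(q, p):
        if qi == 0.0:
            continue  # convention: 0 * log(0 / p_i) = 0
        if pi == 0.0:
            return math.inf  # q puts mass where p has none
        total += qi * math.log(qi / pi)
    return total


# Two illustrative distributions on three points (assumed values).
q = [0.5, 0.3, 0.2]
p = [0.4, 0.4, 0.2]

print(kl_divergence(q, p))  # small and positive
print(kl_divergence(p, q))  # also positive, but not equal to the line above
print(kl_divergence(q, q))  # zero when the two distributions coincide
```

The divergence is never negative and vanishes when the two distributions coincide, but it is not symmetric in its two arguments, so it is not a distance in the usual sense.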