In Chapter 2, Introduction to Semi-Supervised Learning, we discussed the generative Gaussian mixture model in the context of semi-supervised learning. In this section, we're going to apply the EM algorithm to derive the formulas for the parameter updates.
Let's start by considering a dataset, X = {x_1, x_2, ..., x_N}, drawn from a data generating process, p_data.
We assume that the whole distribution is generated by a weighted sum of k Gaussian distributions, so that the probability density of each sample can be expressed as follows:
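Using assumed notation, with mixing weights w_j that are non-negative and sum to 1, means m_j, and covariance matrices S_j, a mixture of this kind is commonly written as p(x_i) = sum over j = 1..k of w_j N(x_i; m_j, S_j). The following is a minimal sketch of how such a density could be evaluated with NumPy; the function names `gaussian_pdf` and `mixture_density`, the parameter names, and the example values are all illustrative and not taken from the text.

```python
import numpy as np


def gaussian_pdf(X, mean, cov):
    """Multivariate Gaussian density evaluated at every row of X."""
    n = mean.shape[0]
    diff = X - mean
    # Squared Mahalanobis distance of each row from the mean
    maha = np.einsum("ij,ij->i", diff @ np.linalg.inv(cov), diff)
    norm = np.sqrt((2.0 * np.pi) ** n * np.linalg.det(cov))
    return np.exp(-0.5 * maha) / norm


def mixture_density(X, weights, means, covariances):
    """Evaluate p(x_i) = sum_j w_j * N(x_i; m_j, S_j) for every row of X."""
    density = np.zeros(X.shape[0])
    for w, m, S in zip(weights, means, covariances):
        # Each component contributes its Gaussian density scaled by its weight
        density += w * gaussian_pdf(X, m, S)
    return density


# Hypothetical mixture with k = 2 components in two dimensions
weights = np.array([0.3, 0.7])
means = np.array([[0.0, 0.0], [3.0, 3.0]])
covariances = np.array([np.eye(2), 2.0 * np.eye(2)])

X = np.array([[0.0, 0.0], [3.0, 3.0], [1.5, 1.5]])
p = mixture_density(X, weights, means, covariances)
print(p)
```

Because the weights sum to 1 and each Gaussian component integrates to 1, the weighted sum is itself a valid probability density.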
In the previous expression, the ...