Let's suppose that we have a dataset made up of n m-dimensional points drawn from a data generating process, pdata:
In many cases, it's possible to assume that the blobs (that is, the densest and most separated regions) are symmetric around a mean (in general, the symmetry is different for each axis), so that they can be represented as multivariate Gaussian distributions. Under this assumption, we can imagine that the probability of each sample is obtained as a weighted sum of k (the number of clusters) multivariate Gaussians parametrized by the mean vector, μj and the covariance matrix, Σi:
This model is called Gaussian ...