Gaussian mixture

Let's suppose that we have a dataset made up of n m-dimensional points drawn from a data generating process, pdata:

In many cases, it's possible to assume that the blobs (that is, the densest and most separated regions) are symmetric around a mean (in general, the symmetry is different for each axis), so that they can be represented as multivariate Gaussian distributions. Under this assumption, we can imagine that the probability of each sample is obtained as a weighted sum of k (the number of clusters) multivariate Gaussians parametrized by the mean vector, μj and the covariance matrix, Σi:

This model is called Gaussian ...

Get Machine Learning Algorithms - Second Edition now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.