Probabilistic clustering with Gaussian Mixture Models

In KMeans, we assume that the variance of the clusters is equal. This leads to a subdivision of space that determines how points are assigned to clusters; but what about a situation where the variances are not equal and each point has some probability of belonging to each cluster?

Getting ready

There's a more probabilistic way of looking at KMeans clustering. Hard KMeans clustering is equivalent to applying a Gaussian Mixture Model with a covariance matrix, S, that can be factored as the variance times the identity matrix, with the same covariance structure shared by every cluster. This leads to spherical clusters.
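
The following is a minimal sketch, not taken from the original recipe, comparing KMeans with a GMM constrained to spherical covariances; the synthetic data and parameter choices are assumptions made for illustration:

from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans
from sklearn.mixture import GaussianMixture

# Synthetic data with roughly equal, spherical clusters.
X, _ = make_blobs(n_samples=300, centers=3, cluster_std=1.0, random_state=0)

# Hard KMeans assignment.
kmeans_labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# GMM constrained to spherical covariances (variance * identity per component).
gmm = GaussianMixture(n_components=3, covariance_type='spherical', random_state=0)
gmm_labels = gmm.fit_predict(X)

print(kmeans_labels[:10], gmm_labels[:10])

With data like this, the two models produce essentially the same hard partition, which is the equivalence described above.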

However, if we allow S to vary, a GMM can be estimated and used for prediction. We'll look ...
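
As a hedged sketch of what "allowing S to vary" looks like with scikit-learn's GaussianMixture, the example below fits a full-covariance GMM and reads off soft cluster probabilities; the data and settings are illustrative assumptions, not the recipe's own:

from sklearn.datasets import make_blobs
from sklearn.mixture import GaussianMixture

# Blobs with deliberately unequal spreads.
X, _ = make_blobs(n_samples=300, centers=3,
                  cluster_std=[0.5, 1.5, 3.0], random_state=0)

# covariance_type='full' lets S vary per component (ellipsoidal clusters).
gmm = GaussianMixture(n_components=3, covariance_type='full', random_state=0).fit(X)

hard_labels = gmm.predict(X)        # most likely component for each point
soft_labels = gmm.predict_proba(X)  # probability of each component for each point
print(soft_labels[:5].round(3))

The predict_proba output is what distinguishes this from hard KMeans: every point carries a probability for each cluster rather than a single assignment.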
