Probabilistic clustering with Gaussian Mixture Models

In KMeans, we assume that the variance of the clusters is equal. This leads to a subdivision of space that determines how the clusters are assigned; but, what about a situation where the variances are not equal and each cluster point has some probabilistic association with it?

Getting ready

There's a more probabilistic way of looking at KMeans clustering. Hard KMeans clustering is the same as applying a Gaussian Mixture Model with a covariance matrix, S, which can be factored to the error times of the identity matrix. This is the same covariance structure for each cluster. It leads to spherical clusters.

However, if we allow S to vary, a GMM can be estimated and used for prediction. We'll look ...

Get scikit-learn : Machine Learning Simplified now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.