Using KMeans to cluster data

Clustering is useful whenever we need to divide and conquer before taking action. Consider a business with a list of potential customers. It might need to group those customers into cohorts, and then assign responsibility for each cohort to a different department. Clustering can help facilitate that grouping.

KMeans is probably one of the most well-known clustering algorithms and, in a larger sense, one of the most well-known unsupervised learning techniques.

Getting ready

First, let's walk through some simple clustering, then we'll talk about how KMeans works:

>>> from sklearn.datasets import make_blobs
>>> blobs, classes = make_blobs(500, centers=3)

Also, since we'll be doing some plotting, import matplotlib ...

scikit-learn : Machine Learning Simplified by Raúl Garreta, Guillermo Moncecchi, Trent Hauck, Gavin Hackeling
