Chapter 11. Clustering in Tableau

You will often want to understand how the items in your data relate to one another. Which groups of products sell well together? How should you market to particular groups of customers? Are there anomalies in your data? If you’re asking these types of questions, clustering is a good place to start finding answers. The primary objective of clustering is to partition a dataset into subgroups, or clusters, so that the data points within one cluster are more similar to each other than they are to the data points in other clusters.

There are many different clustering models, each with its own pros and cons. Tableau’s built-in clustering algorithm is k-means, a widely used model that provides an automated approach to grouping data. Unlike regression models, k-means is an unsupervised model, which means it finds groups without a labeled target variable to predict. It also does not assume that the data are normally distributed.
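To make the idea of automated grouping concrete, here is a minimal Python sketch of the standard k-means loop: assign each point to its nearest center, then move each center to the mean of its points, and repeat until nothing changes. It is an illustration only, not Tableau's implementation. The function names, the sample points, and the farthest-point rule used to pick the starting centers are all assumptions made for this example.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def initial_centers(points, k):
    """Pick starting centers deterministically (an assumption of this sketch):
    begin with the first point, then repeatedly add the point that is
    farthest from the centers chosen so far."""
    centers = [points[0]]
    while len(centers) < k:
        farthest = max(
            points,
            key=lambda p: min(squared_distance(p, c) for c in centers),
        )
        centers.append(farthest)
    return centers


def k_means(points, k, max_iters=100):
    """Group points into k clusters; returns (labels, centers)."""
    centers = initial_centers(points, k)
    labels = None
    for _ in range(max_iters):
        # Assignment step: label each point with the index of its nearest center.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centers[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # no point changed cluster, so the grouping is stable
        labels = new_labels
        # Update step: move each center to the mean of its assigned points.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centers


# Hypothetical data: two visibly separate groups of 2-D points.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.9, 8.3)]
labels, centers = k_means(points, k=2)
print(labels)
print(centers)
```

Note that no target column is supplied anywhere: the algorithm sees only the points themselves, which is what makes it unsupervised. The result can depend on where the centers start, which is why this sketch fixes the starting rule instead of choosing centers at random.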

In this chapter, you will learn how the k-means model works, the difference between supervised and unsupervised models, and how to implement k-means in Tableau.

What Is K-Means Clustering?

K-means clustering is a versatile, unsupervised technique that can be applied to various domains and problems. Examples of how you could use k-means clustering include:

Customer segmentation

K-means clustering is used to group customers based on their purchasing behavior, demographics, or other ...
