Chapter 12: Grouping Data with Clustering

In Chapter 11, we introduced association rules, the first of the two unsupervised machine learning approaches that we cover in this book. There, the objective was to develop a set of rules that describe the patterns between events or items in a transaction set. In this chapter, we introduce the second unsupervised machine learning approach: clustering. With clustering, the objective is to group items based on some measure of similarity. Clustering has several real-world applications. It is most often applied to problems such as customer segmentation (based on demographics or purchase behavior) and the detection of anomalous network activity. In this chapter, we introduce the basic idea behind clustering, discuss the different ways to describe clustering approaches, explore the mechanics of a common clustering algorithm (k-means clustering), and illustrate how to cluster data in R using the k-means algorithm.

By the end of this chapter, you will have learned the following:

The basic idea behind clustering as an unsupervised machine learning approach
How the k-means clustering algorithm works
How to segment data using the k-means algorithm in R ...

Practical Machine Learning in R by Fred Nwanganga, Mike Chapple
