Machine Learning with Python Cookbook

by Chris Albon
March 2018
Content level: Intermediate to advanced
364 pages
7h 12m
English
O'Reilly Media, Inc.

Chapter 19. Clustering

19.0 Introduction

In much of this book we have looked at supervised machine learning, where we have access to both the features and the target. Unfortunately, this is not always the case. Frequently, we run into situations where we know only the features. For example, imagine we have records of sales from a grocery store and we want to break up sales by whether or not the shopper is a member of a discount club. This would be impossible using supervised learning because we don’t have a target with which to train and evaluate our models. However, there is another option: unsupervised learning. If the behavior of discount club members and nonmembers in the grocery store is actually disparate, then the average difference in behavior between two members will be smaller than the average difference in behavior between a member and a nonmember. Put another way, there will be two clusters of observations.
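The within-group versus between-group comparison can be checked on toy data. The following is a minimal sketch, not taken from the book: the two shopper features, the group centers, and the helper names `make_group` and `mean_distance` are all hypothetical, chosen only to illustrate the idea.

```python
import math
import random

rng = random.Random(0)  # fixed seed so the toy data is reproducible


def make_group(center, n=50, spread=0.5):
    """Draw n hypothetical shoppers (two features each) around a center."""
    return [
        (rng.gauss(center[0], spread), rng.gauss(center[1], spread))
        for _ in range(n)
    ]


# Assumed, made-up behavior: members and nonmembers shop very differently
members = make_group((2.0, 2.0))
nonmembers = make_group((6.0, 6.0))


def mean_distance(a, b):
    """Average Euclidean distance over all pairs, skipping self-pairs."""
    distances = [math.dist(p, q) for p in a for q in b if p is not q]
    return sum(distances) / len(distances)


within = mean_distance(members, members)      # member vs. member
between = mean_distance(members, nonmembers)  # member vs. nonmember

# When the groups really differ, within-group distance is the smaller one
print(within < between)
```

If the two centers were moved close together, the gap between `within` and `between` would shrink, and the two clusters would become harder to tell apart.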

The goal of clustering algorithms is to identify those latent groupings of observations. Done well, this allows us to predict the class of observations even without a target vector. There are many clustering algorithms, and they take a wide variety of approaches to identifying the clusters in data. In this chapter, we will cover a selection of clustering algorithms available in scikit-learn and how to use them in practice.

19.1 Clustering Using K-Means

Problem

You want to group observations into k groups.

Solution

Use k-means clustering:

# Load libraries
from sklearn ...
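The book's code is cut off at this point in the preview. As a stand-in, here is a minimal sketch of one common way to run k-means with scikit-learn. It is not necessarily the author's solution: the choice of the Iris dataset, the standardization step, and the values `n_clusters=3`, `n_init=10`, and `random_state=0` are all assumptions made for illustration.

```python
# Load libraries
from sklearn import datasets
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Load example data (assumed dataset; only the features are used)
iris = datasets.load_iris()
features = iris.data

# Standardize features so each contributes comparably to distances
features_std = StandardScaler().fit_transform(features)

# Create the k-means object with k = 3 (an assumed value)
cluster = KMeans(n_clusters=3, n_init=10, random_state=0)

# Fit the model to the standardized features
model = cluster.fit(features_std)

# Predicted cluster membership for each observation
labels = model.labels_

# Coordinates of the three cluster centers
centers = model.cluster_centers_
```

After fitting, `model.predict` can assign a new observation to the nearest cluster center, provided the observation is scaled the same way as the training features.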


Publisher Resources

ISBN: 9781491989371