5. Performing Your First Cluster Analysis

Overview

This chapter will introduce you to unsupervised learning tasks, where algorithms have to automatically learn patterns from data by themselves as no target variables are defined beforehand. We will focus specifically on the k-means algorithm, and see how to standardize and process data for use in cluster analysis.

By the end of this chapter, you will be able to load and visualize data and clusters with scatter plots; prepare data for cluster analysis; perform centroid clustering with k-means; interpret clustering results and determine the optimal number of clusters for a given dataset.

Introduction

The previous chapters introduced you to very popular and extremely powerful machine learning ...

Get The Data Science Workshop - Second Edition now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.