Data Science For Dummies, 2nd Edition by Jake Porway, Lillian Pierson

Chapter 6

Using Clustering to Subdivide Data

IN THIS CHAPTER

Understanding the basics of clustering

Clustering your data with the k-means algorithm and kernel density estimation

Getting to know hierarchical and neighborhood clustering algorithms

Checking out decision tree and random forest algorithms

Data scientists use clustering to divide unlabeled data into subsets. The basics of clustering are relatively easy to understand, but things get tricky fast once you move on to the more advanced algorithms. In this chapter, I introduce those basics and then cover several more nuanced algorithms, so that you can choose a clustering approach suited to the specific characteristics of your feature dataset.

Introducing Clustering Basics

To grasp advanced methods for use in clustering your data, you should first take a few moments to make sure you have a firm understanding of the basics that underlie all forms of clustering. Clustering is a form of machine learning — the machine in this case is your computer, and learning ...
