Practical Predictive Analytics by Ralph Winters

Clustering

Clustering is a method that groups data into classes so that the observations within each class are similar to one another. Similarity can be defined in various ways. K-Means clustering is probably the most popular clustering method. It uses distance measures to assign each observation to the class whose center is closest. Clustering is often used in marketing to develop customer segments.
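To make the distance-based assignment concrete, here is a minimal K-Means sketch in plain Python. It is an illustration rather than the book's own code. The `kmeans` function, the `customers` data, and the choice to start from the first k observations are all assumptions made for this example. Practical implementations usually pick their starting centers more carefully.

```python
import math

def kmeans(points, k, iterations=20):
    """Group points into k clusters using Euclidean distance."""
    # Simplifying assumption: use the first k observations as starting centers.
    centroids = list(points[:k])
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each observation joins its closest center.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster ends up empty
                centroids[c] = tuple(
                    sum(axis) / len(members) for axis in zip(*members)
                )
    return labels, centroids

# Hypothetical customer data: (spend score, visit score)
customers = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
             (8.0, 8.0), (8.2, 7.9), (7.8, 8.1)]
labels, centroids = kmeans(customers, k=2)
print(labels)
print(centroids)
```

With this data, the three low-scoring customers end up sharing one label and the three high-scoring customers share the other, even though both starting centers were taken from the low-scoring group.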

Clustering is an unsupervised technique and is subjective. You can specify beforehand how many groups you wish to cluster into. This number is somewhat arbitrary, and if the goal is interpretability, different choices can lead to different interpretations.
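The effect of the chosen number of groups can be seen in a small sketch that clusters the same one-dimensional data with two and with three groups. Everything here is hypothetical and for illustration only: the `spend` values, the helper names `kmeans_1d` and `within_ss`, and the evenly spread starting centers.

```python
def kmeans_1d(values, k, iterations=20):
    """Cluster numbers into k groups; returns the groups as lists."""
    ordered = sorted(values)
    # Illustrative start: spread the initial centers evenly over the sorted data.
    centers = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    groups = [[] for _ in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in ordered:
            nearest = min(range(k), key=lambda c: abs(v - centers[c]))
            groups[nearest].append(v)
        # Move each center to the mean of its group (unchanged if the group is empty).
        centers = [sum(g) / len(g) if g else centers[i]
                   for i, g in enumerate(groups)]
    return groups

def within_ss(groups):
    """Total squared distance of each value from its own group mean."""
    total = 0.0
    for g in groups:
        if g:
            mean = sum(g) / len(g)
            total += sum((v - mean) ** 2 for v in g)
    return total

spend = [10, 12, 11, 40, 42, 41, 90, 95, 92]  # hypothetical monthly spend
for k in (2, 3):
    groups = kmeans_1d(spend, k)
    print(k, groups, round(within_ss(groups), 1))
```

With two groups, the low and middle spenders are merged into a single segment. With three groups, they are separated and the within-group spread drops sharply. Neither answer is wrong. Which one is more useful depends on how the segments will be interpreted.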

Scatterplots are often used to show data clusters using only two variables (one ...