Using k-means with public datasets

In what follows, we will learn more about partition clustering with k-means while exploring a dataset from the cluster.datasets package. This package contains the datasets that Hartigan (1975) published, with examples of analyses, in the book Clustering Algorithms. Let's start by installing the package on your machine and loading it.

install.packages("cluster.datasets")
library(cluster.datasets)

Understanding the data with the all.us.city.crime.1970 dataset

We will first focus on getting to know the data, scaling it to a common metric, and interpreting the clusters. Our first exploration concerns crime rates in different US cities in 1970. The dataset all.us.city.crime.1970 affords ...
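To make the idea of scaling to a common metric concrete, here is a minimal sketch in base R. It does not use all.us.city.crime.1970: the data frame toy, its column names, and all of its values are made up for illustration, and the choice of two clusters is an assumption. The point is only that columns measured on very different ranges are standardized with scale() before being passed to kmeans(), so that no single column dominates the distance computation.

```r
# Toy data: hypothetical values, NOT taken from all.us.city.crime.1970.
# The three columns deliberately sit on very different numeric ranges.
toy <- data.frame(
  population = c(120, 95, 2500, 2300, 110, 2600),
  rate_a     = c(2.1, 1.8, 9.5, 10.2, 2.4, 9.9),
  rate_b     = c(310, 290, 1250, 1310, 305, 1190)
)

# scale() centres each column to mean 0 and rescales it to
# standard deviation 1, putting all columns on a common metric.
toy_scaled <- scale(toy)

# k-means starts from random centres, so fix the seed for reproducibility
# and try several random starts (nstart) to avoid a poor local solution.
set.seed(42)
fit <- kmeans(toy_scaled, centers = 2, nstart = 10)

# Cluster membership of each row
fit$cluster

# For interpretation, summarise each cluster in the original units
aggregate(toy, by = list(cluster = fit$cluster), FUN = mean)
```

Averaging the unscaled columns per cluster, as in the last line, is one simple way to keep the clusters interpretable: the algorithm works on standardized values, but the summary is read in the original units.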
