Using k-means with public datasets
In what follows, we will learn more about partition clustering with k-means while exploring a dataset from the cluster.datasets package. This package contains datasets that were published in the book Clustering Algorithms by Hartigan (1975), with examples of analyses. Let's start by installing the package on your machine and loading it.
install.packages("cluster.datasets")
library(cluster.datasets)
Understanding the data with the all.us.city.crime.1970 dataset
We will first focus on getting to know the data, scaling the variables to a common metric, and interpreting the clusters. Our first exploration concerns crime rates in different US cities in 1970. The dataset all.us.city.crime.1970 affords ...
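The three steps named above (inspecting the data, scaling it, and interpreting the clusters) can be sketched in a few lines of base R. This is only an illustration under stated assumptions: the data frame `toy`, its column names, and its values are hypothetical stand-ins rather than the contents of all.us.city.crime.1970, and the choice of two clusters is arbitrary.

```r
# Hypothetical toy data standing in for city-level figures;
# column names and values are illustrative, not taken from the package.
set.seed(42)
toy <- data.frame(
  population = c(rnorm(10, mean = 500, sd = 50), rnorm(10, mean = 2000, sd = 200)),
  murder     = c(rnorm(10, mean = 5, sd = 1), rnorm(10, mean = 20, sd = 2))
)

# Step 1: get to know the data
str(toy)
summary(toy)

# Step 2: keep only numeric columns and standardize each to mean 0, sd 1,
# so that no variable dominates the Euclidean distances used by k-means
numeric_cols <- sapply(toy, is.numeric)
scaled <- scale(toy[, numeric_cols])

# Partition into k = 2 clusters (an arbitrary choice for this sketch);
# nstart repeats the random initialization and keeps the best solution
fit <- kmeans(scaled, centers = 2, nstart = 25)

# Step 3: interpret the clusters
table(fit$cluster)   # cluster sizes
fit$centers          # centers, in standardized units

# Per-cluster means on the original scale are usually easier to read
aggregate(toy[, numeric_cols], by = list(cluster = fit$cluster), FUN = mean)
```

With the package loaded, the same steps should apply to the real data after calling data(all.us.city.crime.1970), provided non-numeric columns such as identifiers are excluded before scaling.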