
Practical Predictive Analytics by Ralph Winters


K-means clustering of terms

Now we can cluster the term-document matrix using k-means. For illustration, we will ask for five clusters:

kmeans5 <- kmeans(dtms, 5)
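Because k-means begins from randomly chosen centers, repeated runs of the call above can label the clusters differently. The following is a minimal sketch of one way to make a run repeatable and to inspect the result. It uses a small made-up matrix (toy_dtm, with invented documents and terms) in place of dtms, and the seed value and nstart = 25 are illustrative assumptions, not settings taken from the text.

```r
# Hypothetical stand-in for a document-term matrix: 6 documents x 3 terms
toy_dtm <- matrix(
  c(5, 0, 0,
    4, 1, 0,
    0, 5, 0,
    0, 4, 1,
    0, 0, 5,
    1, 0, 4),
  ncol = 3, byrow = TRUE,
  dimnames = list(paste0("doc", 1:6), c("bag", "heart", "light"))
)

set.seed(123)  # fix the random starting centers so the run can be repeated

# nstart = 25 tries 25 random starts and keeps the best solution
toy_km <- kmeans(toy_dtm, centers = 3, nstart = 25)

toy_km$cluster       # one cluster label per row (document)
toy_km$size          # number of rows assigned to each cluster
toy_km$tot.withinss  # total within-cluster sum of squares (lower is tighter)
```

The same arguments could be passed to the call on dtms if reproducible cluster labels matter for the later subsetting step.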

Once k-means is done, we will append the cluster number to the original data, and then create five subsets based upon the cluster:

kw_with_cluster <- as.data.frame(cbind(OnlineRetail, Cluster = kmeans5$cluster))

# subset the five clusters
cluster1 <- subset(kw_with_cluster, subset = Cluster == 1)
cluster2 <- subset(kw_with_cluster, subset = Cluster == 2)
cluster3 <- subset(kw_with_cluster, subset = Cluster == 3)
cluster4 <- subset(kw_with_cluster, subset = Cluster == 4)
cluster5 <- subset(kw_with_cluster, subset = Cluster == 5)
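As an alternative to writing one subset() call per cluster, base R's split() can produce all the subsets in a single step. This is a sketch only: the data frame toy below is a made-up stand-in for kw_with_cluster, and its Description values and cluster labels are invented for illustration.

```r
# Hypothetical stand-in for kw_with_cluster
toy <- data.frame(
  Description = c("red bag", "blue bag", "heart light", "tea light", "mug"),
  Cluster     = c(1, 1, 2, 2, 3)
)

# split() returns a named list with one data frame per cluster value
by_cluster <- split(toy, toy$Cluster)

names(by_cluster)     # list names are the cluster labels, as strings
by_cluster[["2"]]     # rows assigned to cluster 2
table(toy$Cluster)    # quick check of how many rows fall in each cluster
```

A list keeps the code the same length if the number of clusters changes, whereas separate cluster1 through cluster5 objects are easier to refer to by name in later steps.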
