
Practical Predictive Analytics by Ralph Winters


Running k-means to generate the clusters

Run k-means using the kcca function from the flexclust package to generate five cluster assignments:

# repeat kmeans using the kcca function. Clusters=5
clust1 = kcca(dtMatrix, k = 5, kccaFamily("kmeans"))
clust1
> kcca object of family 'kmeans'
>
> call:
> kcca(x = dtMatrix, k = 5, family = kccaFamily("kmeans"))
>
> cluster sizes:
>
> 1 2 3 4 5
> 360 120 152 387 8981

Print the number of observations assigned to each cluster:

table(clust1@cluster)
>
> 1 2 3 4 5
> 360 120 152 387 8981
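The fit-and-count steps can be tried on a small scale before running them on the full data. The following is a minimal sketch, assuming the flexclust package is installed; toy_matrix, toy_fit, toy_assignments, and toy_sizes are hypothetical names, and the synthetic points stand in for dtMatrix, which is not reproduced here. It uses k = 3 rather than 5 because the toy data is built from three groups.

```r
# Sketch only: synthetic data stands in for the book's dtMatrix.
library(flexclust)

set.seed(42)  # kcca starts from randomly chosen centroids, so fix the seed

# Three well-separated 2-D groups of 50 points each (hypothetical data)
toy_matrix <- rbind(
  matrix(rnorm(100, mean = 0,  sd = 0.5), ncol = 2),
  matrix(rnorm(100, mean = 5,  sd = 0.5), ncol = 2),
  matrix(rnorm(100, mean = 10, sd = 0.5), ncol = 2)
)

# Same call shape as in the text, with k = 3 instead of 5
toy_fit <- kcca(toy_matrix, k = 3, family = kccaFamily("kmeans"))

# clusters() is the accessor for the per-row assignment vector
toy_assignments <- clusters(toy_fit)

# Count how many rows landed in each cluster
toy_sizes <- table(toy_assignments)
print(toy_sizes)
```

The counts from table() should add up to the number of rows in the input matrix, which is a quick check that every observation received a cluster label.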

Merge the clusters with the training data, and show some sample records displaying the cluster assigned to each:

kw_with_cluster2 <- as.data.frame(cbind(OnlineRetail, Cluster = clust1@cluster))
head(kw_with_cluster2)
> InvoiceNo StockCode ...
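The merge relies on cbind() pairing rows by position, so the cluster vector must be in the same row order as the data it was computed from. A minimal sketch of the same pattern in base R follows; toy_orders is a made-up stand-in for OnlineRetail, and stats::kmeans is swapped in for kcca purely to obtain an assignment vector without extra packages.

```r
# Sketch only: toy_orders is a made-up stand-in for OnlineRetail.
set.seed(1)
toy_orders <- data.frame(
  Quantity  = c(rpois(20, 2), rpois(20, 40)),
  UnitPrice = c(runif(20, 1, 5), runif(20, 20, 30))
)

# stats::kmeans is used here only to produce one cluster label per row
toy_km <- kmeans(scale(toy_orders), centers = 2, nstart = 10)

# cbind() on a data frame appends the vector as a new column, row by row
toy_with_cluster <- cbind(toy_orders, Cluster = toy_km$cluster)
head(toy_with_cluster)

# Per-cluster column means give a first profile of each segment
cluster_means <- aggregate(cbind(Quantity, UnitPrice) ~ Cluster,
                           data = toy_with_cluster, FUN = mean)
print(cluster_means)
```

Because the first argument is already a data frame, cbind() returns a data frame; the as.data.frame() wrapper used in the text is harmless in that case.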
