Practical Predictive Analytics by Ralph Winters

Graphically display cluster assignment

Bivariate cluster plots are often useful for seeing how the cluster assignments relate to the values of two variables on an x-y plot. Each cluster is plotted in a different color.

In Databricks, you can do this easily:

First, run the display command on some of the fitted data. In the code below, I have first extracted a sample of at most 1,000 rows. You want the sample to be small enough that the points on the plot are not too dense.

    # draw a 1% sample without replacement, then keep at most 1,000 rows
    tmp <- head(sample(fitted, FALSE, .01), 1000)

    # show cluster assignment by 2 variable matrix
    display(tmp)

Next, open the Customize Plot dialog and perform the following graph setup:

  1. Change the graph type to scatter plot.
  2. Drag the prediction to the keys area. ...
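
Outside Databricks, the same kind of bivariate cluster plot can be sketched in base R. This is only an illustration under stated assumptions: the built-in iris data set and stats::kmeans stand in for the data and the Spark model used above, the name fitted_local is hypothetical, and the prediction column is named to mirror the fitted output.

```r
# Illustrative only: iris and stats::kmeans are stand-ins for the book's data
# and Spark model; `fitted_local` is a hypothetical name for this sketch.
set.seed(42)                                   # reproducible clustering and sampling

features <- iris[, c("Petal.Length", "Petal.Width")]
model    <- kmeans(features, centers = 3, nstart = 10)

# Attach the cluster assignment in a column named `prediction`
fitted_local <- cbind(features, prediction = model$cluster)

# Keep the sample small so the points on the plot are not too dense
n_sample <- min(100, nrow(fitted_local))
tmp <- fitted_local[sample(nrow(fitted_local), n_sample), ]

# x-y scatter plot of two variables, one color per cluster
plot(tmp$Petal.Length, tmp$Petal.Width,
     col = tmp$prediction, pch = 19,
     xlab = "Petal.Length", ylab = "Petal.Width",
     main = "Cluster assignment by two variables")
legend("topleft", legend = sort(unique(tmp$prediction)),
       col = sort(unique(tmp$prediction)), pch = 19, title = "Cluster")
```

Coloring the points by prediction plays the same role as dragging prediction to the keys area: each cluster gets its own color on the scatter plot.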
