O'Reilly logo

Practical Predictive Analytics by Ralph Winters

Stay ahead with the world's most comprehensive technology and business learning platform.

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, tutorials, and more.

Start Free Trial

No credit card required

Generating a correlation plot

Here is a more sophisticated visualization which uses ggplot to illustrate how to generate a correlation matrix using shading to indicate the degree of correlation for each of the intersecting variables. Again, the point is to emphasis that you can perform analysis outside of Spark if your sample size is reasonable, and the exact functionality you need is not available in the version of Spark you are running.

require(ggplot2)library(reshape2)cormatrix <- round(cor(samp),2)cormatrix_melt <- melt(cormatrix)head(cormatrix_melt)ggplot(data = cormatrix_melt, aes(x=Var1, y=Var2, fill=value)) + geom_raster() 

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, interactive tutorials, and more.

Start Free Trial

No credit card required