June 2017
Beginner to intermediate
576 pages
15h 22m
English
Here is a more sophisticated visualization that uses ggplot to plot a correlation matrix, with shading indicating the degree of correlation between each pair of intersecting variables. Again, the point is to emphasize that you can perform analysis outside of Spark if your sample size is reasonable and the exact functionality you need is not available in the version of Spark you are running.
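library(ggplot2)
library(reshape2)

# Pairwise correlations of the sampled data, rounded to two decimals
cormatrix <- round(cor(samp), 2)

# Reshape the square matrix into long form: one row per (Var1, Var2) pair
cormatrix_melt <- melt(cormatrix)
head(cormatrix_melt)

# Draw one tile per variable pair, shaded by the correlation value
ggplot(data = cormatrix_melt, aes(x = Var1, y = Var2, fill = value)) +
  geom_raster()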
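The same approach can be tried without a Spark session. The following is a minimal sketch, not the book's code: it assumes the built-in mtcars dataset as a stand-in for the collected sample samp, and the diverging color scale with fixed limits of -1 to 1 is an illustrative choice so that negative and positive correlations are shaded differently.

```r
library(ggplot2)
library(reshape2)

# Assumption: mtcars (first seven numeric columns) stands in for `samp`
samp_demo <- mtcars[, 1:7]

# Correlation matrix, rounded, then melted to long form (Var1, Var2, value)
cormatrix <- round(cor(samp_demo), 2)
cormatrix_melt <- melt(cormatrix)

# Heatmap with a diverging scale pinned to the full correlation range
p <- ggplot(cormatrix_melt, aes(x = Var1, y = Var2, fill = value)) +
  geom_raster() +
  scale_fill_gradient2(low = "blue", mid = "white", high = "red",
                       midpoint = 0, limits = c(-1, 1))
print(p)
```

Pinning the limits to -1 and 1 keeps the shading comparable across different samples, since the default scale would otherwise stretch to whatever range the current sample happens to contain.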
