June 2017
Beginner to intermediate
576 pages
15h 22m
English
One way to plot this in Spark is to construct a histogram of the predicted probabilities, with the midpoint of each bin (its centroid) on the x axis and the raw count for each bin on the y axis.
x = SparkR::histogram(preds_train, preds_train$prediction, nbins = 100)
x$centroids = round(x$centroids, 2)
display(x)
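To see what the centroids and counts represent, here is a minimal base R sketch that runs without Spark. The `preds` vector is made-up illustrative data, not the book's `preds_train`, and the bin edges are fixed to the 0 to 1 probability range as a simplifying assumption; the exact binning SparkR applies may differ.

```r
# Hypothetical stand-in for the prediction column (illustrative values only).
preds <- c(0.05, 0.12, 0.18, 0.33, 0.41, 0.47, 0.52, 0.68, 0.91, 0.99)

# Assumption: five equal-width bins over the probability range [0, 1].
edges <- seq(0, 1, by = 0.2)

# plot = FALSE returns the bin statistics without drawing anything.
h <- hist(preds, breaks = edges, plot = FALSE)

# One row per bin: the midpoint (centroid) and the raw count.
binned <- data.frame(
  centroids = round(h$mids, 2),  # rounded so the axis labels stay readable
  counts    = h$counts
)
print(binned)
```

Plotting `counts` against `centroids` as a bar chart gives the same kind of picture as the displayed SparkR result, only with fewer bins.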
After the display command has run, click the plot icon (the second icon at the bottom left), then click "Plot Options". The Customize Plot screen then appears:
Centroids represent the midpoints of the bars. In the code above, we rounded them ...