O'Reilly logo

Practical Predictive Analytics by Ralph Winters

Stay ahead with the world's most comprehensive technology and business learning platform.

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, tutorials, and more.

Start Free Trial

No credit card required

Examining the distributions by outcome

Next, you can run ggplot to graphically display the errors grouped by outcome. The resulting boxplots show that the three quartiles for diabetes are above the non-diabetic patients. This demonstrates that the model's prediction error runs higher when predicting those patients who actually developed diabetes. This would certainly imply that the model needs to be improved:

library(ggplot2)ggplot(local, aes(factor(outcome),error)) + geom_boxplot() 

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, interactive tutorials, and more.

Start Free Trial

No credit card required