
Mastering Predictive Analytics with R - Second Edition by Rui Miguel Forte, James D. Miller


Summary

In this chapter, we explored the fundamental ideas behind data quality, showed how to categorize quality issues by type, and presented ideas for tidying up your data.

To compare the performance of the different models that one may create, we then established some fundamental notions of model performance, such as the mean squared error (MSE) for regression and the classification error rate for classification.
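As a quick reminder of what these two measures compute, here is a minimal sketch. It is written in plain Python rather than R so that it has no package dependencies, and the function names `mean_squared_error` and `classification_error_rate` are illustrative choices, not names taken from the chapter.

```python
def mean_squared_error(actual, predicted):
    """Average of the squared differences between observed and predicted values."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def classification_error_rate(actual, predicted):
    """Fraction of observations whose predicted label differs from the true label."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(a != p for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical toy values, used only to exercise the two functions.
regression_mse = mean_squared_error([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
error_rate = classification_error_rate(["a", "b", "a", "b"], ["a", "a", "a", "b"])
```

Lower values are better for both measures; the error rate is simply one minus the classification accuracy.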

We also introduced cross-validation as a general assessment technique for cases where only a limited amount of data is available.
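The idea can be sketched as k-fold cross-validation: split the data into k folds, hold each fold out in turn, train on the rest, and average the held-out scores. The sketch below is plain Python under stated assumptions; the helper names (`k_fold_indices`, `cross_validate`, `fit_mean`, `predict_mean`, `mse`) and the toy data are hypothetical, and the "model" is just a mean predictor chosen to keep the example short.

```python
import random


def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k roughly equal folds."""
    if not 2 <= k <= n:
        raise ValueError("k must be between 2 and n")
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(xs, ys, fit, predict, score, k=5, seed=0):
    """Average the held-out score over k folds."""
    scores = []
    for held_out in k_fold_indices(len(xs), k, seed):
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        predictions = [predict(model, xs[i]) for i in held_out]
        scores.append(score([ys[i] for i in held_out], predictions))
    return sum(scores) / len(scores)


# A deliberately simple baseline: always predict the mean of the training targets.
def fit_mean(xs, ys):
    return sum(ys) / len(ys)


def predict_mean(model, x):
    return model


def mse(actual, predicted):
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical toy data: y = 2x on ten points.
xs = list(range(10))
ys = [2.0 * x for x in xs]
cv_mse = cross_validate(xs, ys, fit_mean, predict_mean, mse, k=5)
```

Because every observation is held out exactly once, all of the data contributes to both training and assessment, which is what makes the approach attractive when data is scarce.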

Finally, we discussed learning curves as a way to judge a model's ability to learn, that is, to improve its scores.
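A minimal sketch of one common way to build a learning curve follows, assuming the usual definition in which training and test error are recorded as the training set grows. It is plain Python; `fit_line`, `learning_curve`, and the synthetic data are hypothetical names and values introduced only for illustration, and a one-variable least-squares line stands in for whatever model is being assessed.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for a single predictor; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return slope, mean_y - slope * mean_x


def line_mse(model, xs, ys):
    """Mean squared error of a fitted line on the given points."""
    slope, intercept = model
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys)) / len(xs)


def learning_curve(train_x, train_y, test_x, test_y, sizes):
    """For each training size m, fit on the first m points and record both errors."""
    curve = []
    for m in sizes:
        model = fit_line(train_x[:m], train_y[:m])
        curve.append((m,
                      line_mse(model, train_x[:m], train_y[:m]),
                      line_mse(model, test_x, test_y)))
    return curve


# Synthetic data: y = 2x + 1 plus Gaussian noise (seeded for repeatability).
rng = random.Random(42)
all_x = [rng.uniform(0, 10) for _ in range(60)]
all_y = [2.0 * x + 1.0 + rng.gauss(0, 1) for x in all_x]
curve = learning_curve(all_x[:40], all_y[:40], all_x[40:], all_y[40:], [5, 10, 20, 40])
```

Each tuple in `curve` holds the training size, the training MSE, and the test MSE; plotting the two error columns against the size gives the learning curve.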

With a firm grounding ...
