3

Regularization

In high-dimensional settings, where many possible signals could be included in your model, you need to select the model that best predicts future data while avoiding overfitting. To do this, you first use recipes that generate a good array of candidate models. You then choose, among these candidates, the model that minimizes the estimated error rate when predicting on new data. This chapter introduces the key tools for such high-dimensional modeling.
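As a rough illustration of this two-step workflow, the sketch below builds a path of candidate models and then picks the one with the lowest estimated error on held-out data. Everything in it is an assumption made for illustration rather than the chapter's own method: the data are simulated, ridge regression stands in as the recipe for generating candidates (chosen only because it has a closed-form solution), mean squared error is the error measure, and the names `ridge_fit`, `mse`, and `lambdas` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated high-dimensional data (an assumption for illustration):
# 100 observations and 60 candidate signals, only 5 of which matter.
n, p = 100, 60
X = rng.normal(size=(n, p))
beta_true = np.zeros(p)
beta_true[:5] = [3.0, -2.0, 1.5, 1.0, -1.0]
y = X @ beta_true + rng.normal(size=n)

# Hold out the last 30 rows to estimate the error on new data.
X_train, y_train = X[:70], y[:70]
X_test, y_test = X[70:], y[70:]


def ridge_fit(X, y, lam):
    """Closed-form ridge coefficients for penalty weight lam (no intercept)."""
    k = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(k), X.T @ y)


def mse(X, y, beta):
    """Mean squared prediction error of coefficients beta on (X, y)."""
    return float(np.mean((y - X @ beta) ** 2))


# Step 1: a path of candidate models, from lightly to heavily penalized.
lambdas = np.logspace(-3, 3, 25)
candidates = [ridge_fit(X_train, y_train, lam) for lam in lambdas]

# Step 2: score every candidate on the training data and on the held-out data.
in_sample = [mse(X_train, y_train, b) for b in candidates]
out_of_sample = [mse(X_test, y_test, b) for b in candidates]

# Select the candidate with the smallest estimated out-of-sample error.
best = int(np.argmin(out_of_sample))
print(f"selected penalty: {lambdas[best]:.4g}")
print(f"in-sample MSE: {in_sample[best]:.3f}")
print(f"out-of-sample MSE: {out_of_sample[best]:.3f}")
```

The in-sample error can only get worse as the penalty grows, so the training data alone would always favor the least penalized candidate. The held-out error is what guides the choice away from an overfit model.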

Out-of-Sample Performance

In the previous chapter, we introduced deviance as a measure of how tightly your model fits the training data. When you apply your models for prediction ...

Excerpt from Business Data Science: Combining Machine Learning and Economics to Optimize, Automate, and Accelerate Business Decisions.