Chapter 3
Dealing with Model Overfitting and Underfitting
IN THIS CHAPTER
Defining overfitting and underfitting
Considering model problem sources
Understanding the role of features
A model describes a set of data points in the form of an algorithm (often represented by a mathematical function). Book 3 discusses various kinds of modeling associated with particular data point patterns. For example, data points that form a straight line lend themselves to linear regression. The purpose of creating a model is either to predict the location of future data points or to categorize data based on where it falls within the model. However, a model is only as good as the underlying algorithm. An algorithm that follows the original data points too closely overfits the curve to the data: It captures the random noise in those particular points along with the underlying pattern, so it predicts new points poorly. An algorithm that doesn’t follow the original data points closely enough underfits the curve to the data: It misses part of the pattern, so it does poorly even on the data used to create it. Overfitting and underfitting are both problems, which is why you need this chapter. Unless a model runs true to the data, anything you use the model for is suspect.
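To make the difference concrete, here is a minimal sketch that fits polynomials of three different degrees to the same noisy data. Everything in it is an assumption made for illustration rather than something taken from the book: the sine-shaped pattern, the noise level, the chosen degrees, and the helper name fit_and_score are all hypothetical. The training error shows how closely each curve follows the original points, and the error on separate test points hints at how well it might handle new data.

```python
import numpy as np


def fit_and_score(degree, x_train, y_train, x_test, y_test):
    """Fit a polynomial of the given degree; return (train_mse, test_mse)."""
    coeffs = np.polyfit(x_train, y_train, degree)
    train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    return train_mse, test_mse


def true_pattern(x):
    # Assumed underlying pattern, chosen only for this illustration.
    return np.sin(3 * x)


rng = np.random.default_rng(0)

# Twelve noisy training points and fifty noisy test points from the same pattern.
x_train = np.linspace(-1, 1, 12)
x_test = np.linspace(-0.95, 0.95, 50)
y_train = true_pattern(x_train) + rng.normal(0, 0.2, x_train.size)
y_test = true_pattern(x_test) + rng.normal(0, 0.2, x_test.size)

# Degree 1 is a straight line (too simple for this pattern), degree 3 is a
# moderate curve, and degree 11 has enough freedom to pass through all
# twelve training points, noise included.
results = {
    degree: fit_and_score(degree, x_train, y_train, x_test, y_test)
    for degree in (1, 3, 11)
}

for degree, (train_mse, test_mse) in results.items():
    print(f"degree {degree:2d}: train MSE = {train_mse:.4f}, test MSE = {test_mse:.4f}")
```

The training error can only shrink as the degree rises, because each higher-degree polynomial can reproduce any lower-degree one. That's why a low training error alone doesn't tell you the model is good: the degree 11 curve reaches a training error of essentially zero by following the noise. Comparing the test error across the three degrees is what helps you spot a curve that is too simple or too flexible.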
After you know why overfitting and underfitting occur, you need to consider the sources of these two problems. In some cases, the problem ...