The large capacity of neural networks to approximate arbitrary functions greatly increases the risk of overfitting. For every model so far, we have seen some form of regularization that modifies the learning algorithm with the aim of reducing its generalization error, even if it does not reduce its training error. Examples include the penalties added to the ridge and lasso regression objectives and the split constraints used with decision trees and tree-based ensemble models.
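To make the penalty idea concrete, here is a minimal sketch (my illustration, not code from the text; the function name `penalized_mse`, the toy data, and the choice of `alpha` are all hypothetical) of how a ridge (L2) or lasso (L1) term is added to a plain mean-squared-error objective, so that large weights raise the training objective:

```python
import numpy as np

def penalized_mse(w, X, y, alpha, penalty="ridge"):
    """Mean squared error plus a ridge (L2) or lasso (L1) penalty on the weights."""
    residuals = y - X @ w
    mse = np.mean(residuals ** 2)
    if penalty == "ridge":
        return mse + alpha * np.sum(w ** 2)      # L2 penalty shrinks weights smoothly
    if penalty == "lasso":
        return mse + alpha * np.sum(np.abs(w))   # L1 penalty can drive weights to exactly zero
    raise ValueError(f"unknown penalty: {penalty}")

# Toy data: the penalty term raises the training objective for large weights,
# trading a little extra bias for lower variance.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = X @ np.array([1.0, 0.5, 0.0, 0.0, -2.0]) + rng.normal(scale=0.1, size=100)

w_ols = np.linalg.lstsq(X, y, rcond=None)[0]     # unregularized least-squares fit
print(penalized_mse(w_ols, X, y, alpha=0.0))                     # plain MSE
print(penalized_mse(w_ols, X, y, alpha=0.1, penalty="ridge"))    # MSE + L2 penalty
print(penalized_mse(w_ols, X, y, alpha=0.1, penalty="lasso"))    # MSE + L1 penalty
```

Minimizing the penalized objective rather than the plain MSE is what pulls the fitted weights toward zero, with the strength of the pull set by `alpha`.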
Frequently, regularization takes the form of a soft constraint on parameter values that trades off some additional bias for lower variance. Sometimes the constraints and penalties are designed to encode prior knowledge. Other times, ...