Regularization and dropout

Overfitting is a common issue in deep models. Their extremely high capacity can become problematic even with very large datasets, because the ability to learn the structure of the training set is not always related to the ability to generalize. A deep neural network can easily become an associative memory, but its final internal configuration may not be the most suitable one for handling samples that belong to the same distribution but were never presented during the training process. It goes without saying that this behavior is proportional to the complexity of the separation hypersurface: a linear classifier has a minimal chance of overfitting, while a polynomial classifier is far more prone to do so. A combination ...
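As a quick illustration of the capacity argument above, the following sketch (not taken from the book; the synthetic dataset, the degree-15 polynomial, and all parameter values are arbitrary choices) fits a linear model and a high-degree polynomial model to the same small, noisy sample. The high-capacity model typically drives the training error close to zero while its test error grows, which is exactly the overfitting behavior being described:

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

# Small, noisy sample drawn from a smooth underlying function
rng = np.random.RandomState(1000)
X = rng.uniform(-3.0, 3.0, size=(30, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.2, size=30)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=1000)

# Compare a low-capacity (linear) model with a high-capacity
# (degree-15 polynomial) one on the same data
for degree in (1, 15):
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_train, y_train)
    print("degree={:2d}  train MSE={:.4f}  test MSE={:.4f}".format(
        degree,
        mean_squared_error(y_train, model.predict(X_train)),
        mean_squared_error(y_test, model.predict(X_test))))

The expected pattern is that the degree-15 model achieves a much lower training error but a higher test error than the linear one: its extra capacity is spent memorizing the noise rather than learning structure that generalizes.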