Overfitting and cross validation

As you may remember from Chapters 2, 3, and 4, one of the problems with our methodology when building models was that we were guilty of overfitting. Overfitting, the bane of predictive analytics, is what happens when we build a model that does a great job with past data but then falls apart when new data is introduced. This phenomenon is not unique to data science; it happens a lot in our society: professional athletes get lucrative contracts and then fail to live up to their prior performances, fund managers get hefty salary bumps because of last year's performance, and the list goes on.
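To make the idea concrete, here is a minimal, language-neutral sketch in Python; it is not code from the earlier chapters. The data points, the function names (`fit_memorizer`, `fit_line`, `mse`), and the choice of a nearest-neighbor "memorizer" as the overfit model are all assumptions made for illustration. It compares a model that simply memorizes past data with a plain least-squares line, scoring each on the data it was built from and on data it has never seen.

```python
def mse(model, data):
    """Mean squared error of `model` over (x, y) pairs."""
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)


def fit_memorizer(train):
    """Overfit on purpose: answer with the y of the closest training x."""
    def predict(x):
        nearest = min(train, key=lambda point: abs(point[0] - x))
        return nearest[1]
    return predict


def fit_line(train):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(train)
    mean_x = sum(x for x, _ in train) / n
    mean_y = sum(y for _, y in train) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in train)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in train)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


# Made-up data (an assumption for this sketch): roughly y = 2x,
# with hand-picked noise of +1 or -1 on every point.
train = [(0, 1), (2, 3), (4, 9), (6, 11), (8, 17)]  # "past" data
test = [(0.5, 0), (2.5, 6), (4.5, 8), (6.5, 14)]    # "new" data

memorizer = fit_memorizer(train)
line = fit_line(train)

# (error on past data, error on new data) for each model
results = {
    "memorizer": (mse(memorizer, train), mse(memorizer, test)),
    "line": (mse(line, train), mse(line, test)),
}

for name, (train_err, test_err) in results.items():
    print(f"{name:10s} train MSE = {train_err:.2f}  test MSE = {test_err:.2f}")
# memorizer  train MSE = 0.00  test MSE = 5.00
# line       train MSE = 0.96  test MSE = 1.04
```

On this toy data the memorizer looks perfect on the past data and is the worse of the two on new data, while the line has a higher error on the past data but holds up almost unchanged on new data. That gap between the two scores is the symptom of overfitting described above.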

Cross validation – train versus test

Unlike the Yankees, who never seem to learn, our profession has learned from its mistakes and ...
