5
Regularization with Data
Although models offer plenty of regularization methods, each with its own set of hyperparameters, the most effective regularization sometimes comes from the data itself. Even the most powerful model cannot perform well if the data is not transformed properly beforehand.
In this chapter, we’ll look at some methods that help regularize models through the data:
- Hashing high-cardinality features
- Aggregating features
- Undersampling an imbalanced dataset
- Oversampling an imbalanced dataset
- Resampling imbalanced data with SMOTE
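To make a few of these ideas concrete before the recipes themselves, here is a minimal sketch of feature hashing and random under/oversampling. It uses only the Python standard library and is not the chapter's own code: the function names (`hash_feature`, `random_undersample`, `random_oversample`), the bucket count, and the toy data are assumptions chosen for illustration. SMOTE is not covered here, since it generates synthetic minority samples by interpolation rather than by dropping or duplicating rows.

```python
import hashlib
import random
from collections import Counter


def hash_feature(value, n_buckets=8):
    """Map a category string to a bucket index in [0, n_buckets).

    Hypothetical helper: a stable hash keeps the mapping identical across
    runs, and unseen categories still land in a valid bucket.
    """
    digest = hashlib.md5(value.encode("utf-8")).hexdigest()
    return int(digest, 16) % n_buckets


def _group_by_label(rows, labels):
    """Collect rows per class label, preserving first-seen label order."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(label, []).append(row)
    return groups


def random_undersample(rows, labels, seed=0):
    """Randomly drop rows so every class matches the smallest class size."""
    rng = random.Random(seed)
    groups = _group_by_label(rows, labels)
    target = min(len(group) for group in groups.values())
    out_rows, out_labels = [], []
    for label, group in groups.items():
        out_rows.extend(rng.sample(group, target))
        out_labels.extend([label] * target)
    return out_rows, out_labels


def random_oversample(rows, labels, seed=0):
    """Randomly duplicate rows so every class matches the largest class size."""
    rng = random.Random(seed)
    groups = _group_by_label(rows, labels)
    target = max(len(group) for group in groups.values())
    out_rows, out_labels = [], []
    for label, group in groups.items():
        extra = rng.choices(group, k=target - len(group))
        out_rows.extend(group + extra)
        out_labels.extend([label] * target)
    return out_rows, out_labels


# Toy imbalanced dataset (made up for this sketch): 8 negatives, 2 positives.
rows = list(range(10))
labels = [0] * 8 + [1] * 2

under_rows, under_labels = random_undersample(rows, labels)
over_rows, over_labels = random_oversample(rows, labels)
print(Counter(under_labels), Counter(over_labels))
print(hash_feature("Paris"), hash_feature("Tokyo"))
```

Hashing trades a small risk of collisions (two categories sharing a bucket) for a fixed, bounded number of columns, which is what gives it a regularizing effect on high-cardinality features. Undersampling discards information from the majority class, while oversampling repeats minority rows and can encourage overfitting to them.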
Technical requirements
In this chapter, you will apply several tricks to data, as well as resample datasets or download new data via the command line. ...