5 Regularization with Data

Although models offer plenty of regularization methods (each model having its own set of hyperparameters), the most effective regularization sometimes comes from the data itself. Indeed, even the most powerful model can’t perform well if the data is not transformed properly beforehand.

In this chapter, we’ll look at several methods that help regularize models through the data:

Hashing high cardinality features
Aggregating features
Undersampling an imbalanced dataset
Oversampling an imbalanced dataset
Resampling imbalanced data with SMOTE
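
To give a feel for three of the ideas listed above, here is a minimal sketch using only the Python standard library. It is not the chapter's own code: the function names (`hash_feature`, `random_undersample`, `random_oversample`), the bucket count, and the seeds are all assumptions made for illustration. SMOTE, which creates synthetic minority samples by interpolating between neighbors, is not shown.

```python
import hashlib
import random
from collections import defaultdict


def hash_feature(value, n_buckets=16):
    """Map a categorical value to one of n_buckets integer buckets.

    Hypothetical helper: a stable hash keeps the mapping identical
    across runs, unlike Python's built-in hash() for strings.
    """
    digest = hashlib.sha256(str(value).encode("utf-8")).hexdigest()
    return int(digest, 16) % n_buckets


def _group_by_label(rows, labels):
    """Collect rows into one list per class label."""
    groups = defaultdict(list)
    for row, label in zip(rows, labels):
        groups[label].append(row)
    return groups


def random_undersample(rows, labels, seed=0):
    """Randomly drop rows until every class matches the smallest class."""
    rng = random.Random(seed)
    groups = _group_by_label(rows, labels)
    target = min(len(group) for group in groups.values())
    out_rows, out_labels = [], []
    for label, group in groups.items():
        for row in rng.sample(group, target):
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


def random_oversample(rows, labels, seed=0):
    """Randomly duplicate rows until every class matches the largest class."""
    rng = random.Random(seed)
    groups = _group_by_label(rows, labels)
    target = max(len(group) for group in groups.values())
    out_rows, out_labels = [], []
    for label, group in groups.items():
        # Draw the missing rows with replacement from the same class.
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for row in group + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels
```

With a toy dataset of eight rows in class 0 and two rows in class 1, `random_undersample` would return two rows per class and `random_oversample` eight rows per class. In practice, resampling should be applied to the training split only, so that the evaluation data keeps its original class balance.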

Technical requirements

In this chapter, you will apply several tricks to data, as well as resample datasets or download new data via the command line. ...

The Regularization Cookbook by Vincent Vandenbussche
