13. Imbalanced Datasets

Overview

By the end of this chapter, you will be able to identify use cases where datasets are likely to be imbalanced; formulate strategies for dealing with imbalanced datasets; build classification models, such as logistic regression models, after balancing datasets; and analyze classification metrics to validate whether adopted strategies are yielding the desired results.

In this chapter, you will work with imbalanced datasets, which are common in real-world problems. You will use techniques such as SMOTE, MSMOTE, and random undersampling to address the imbalance.
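To make the simplest of these techniques concrete, here is a minimal sketch of random undersampling in plain Python. It is an illustration under stated assumptions, not the implementation used in this chapter: the function name `random_undersample`, the toy data, and the choice to shrink every class to the size of the smallest one are all hypothetical.

```python
import random
from collections import Counter


def random_undersample(rows, labels, seed=0):
    """Randomly drop rows so that every class is as small as the smallest class.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)

    # Group the rows by their class label.
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)

    # The target size is the size of the rarest class.
    target = min(len(group) for group in by_class.values())

    out_rows, out_labels = [], []
    for label, group in by_class.items():
        # Sample without replacement; the minority class is kept whole.
        kept = rng.sample(group, target)
        out_rows.extend(kept)
        out_labels.extend([label] * target)
    return out_rows, out_labels


# Toy dataset (an assumption): 95 examples of class 0 and 5 of class 1.
rows = [[i] for i in range(100)]
labels = [0] * 95 + [1] * 5

balanced_rows, balanced_labels = random_undersample(rows, labels)
print(Counter(labels))           # class counts before balancing
print(Counter(balanced_labels))  # class counts after balancing
```

Undersampling discards majority-class rows, so it trades information for balance. Oversampling approaches such as SMOTE instead synthesize new minority-class examples.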

Introduction

In the previous chapter, Chapter 12, Feature Engineering, where we dealt with data points related to dates, we ...

