Table of Contents
Preface
1. Introduction to Data Imbalance in Machine Learning
Technical requirements
Introduction to imbalanced datasets
Machine learning 101
What happens during model training?
Types of dataset and splits
Cross-validation
Common evaluation metrics
Confusion matrix
ROC
Precision-Recall curve
Relation between the ROC curve and PR curve
Challenges and considerations when dealing with imbalanced data
When can we have an imbalance in datasets?
Why can imbalanced data be a challenge?
When to not worry about data imbalance
Introduction to the imbalanced-learn library
General rules to follow
Summary
Questions
References
2. Oversampling Methods
Technical requirements
What is oversampling?
Random oversampling
Problems with random oversampling ...
Get Machine Learning for Imbalanced Data now with the O’Reilly learning platform.