A much stronger approach has been proposed by Chawla et al. (in SMOTE: Synthetic Minority Over-sampling Technique, Chawla N. V., Bowyer K. W., Hall L. O., Kegelmeyer W. P., Journal of Artificial Intelligence Research, 16/2002). The algorithm, called Synthetic Minority Over-sampling Technique (SMOTE), is designed, unlike the previous one, to generate new samples that are coherent with the minority class distribution. A full description of the algorithm is beyond the scope of this book (it can be found in the aforementioned paper); however, the main idea is to consider the relationships that exist between samples and to create new synthetic points along the segments connecting each minority sample to its nearest minority-class neighbors. Let's consider ...
SMOTE resampling
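The interpolation idea described above can be illustrated with a minimal NumPy sketch. This is not the reference implementation from the paper: the function name `smote_sketch`, its parameters, and the toy data are hypothetical, and base samples are drawn at random instead of following the per-sample oversampling rate used by Chawla et al. Each synthetic point is placed at a random position on the segment joining a minority sample to one of its k nearest minority neighbors.

```python
import numpy as np


def smote_sketch(X_min, n_new, k=5, seed=0):
    """Simplified SMOTE-style interpolation (illustrative, hypothetical helper).

    X_min: array of minority-class samples, shape (n, d), with n >= 2.
    n_new: number of synthetic samples to generate.
    k:     number of nearest minority neighbors considered per sample.
    """
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)
    n = X_min.shape[0]
    k = min(k, n - 1)  # cannot have more neighbors than other samples

    # Pairwise squared Euclidean distances between minority samples
    diff = X_min[:, None, :] - X_min[None, :, :]
    dist = np.sum(diff ** 2, axis=2)
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbor

    # Indices of the k nearest minority neighbors of each sample
    neighbors = np.argsort(dist, axis=1)[:, :k]

    # Pick a random base sample and one of its neighbors for each new point
    base = rng.integers(0, n, size=n_new)
    chosen = neighbors[base, rng.integers(0, k, size=n_new)]

    # Random position along the segment base -> neighbor
    gap = rng.random((n_new, 1))
    return X_min[base] + gap * (X_min[chosen] - X_min[base])


# Toy minority class: 20 two-dimensional points (made-up data)
X_minority = np.random.default_rng(1).normal(loc=2.0, scale=0.5, size=(20, 2))
X_synthetic = smote_sketch(X_minority, n_new=80, k=5)
print(X_synthetic.shape)
```

Because every synthetic point is a convex combination of two existing minority samples, the new points stay inside the region already occupied by the minority class. In practice, a maintained implementation such as the `SMOTE` class in the imbalanced-learn package is usually preferable to a hand-written version.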