How it works...
As can be seen from the last barplot, the dataset is very imbalanced: some classes have as many as 2,000 sample images corresponding to it, whereas a few classes have as few as 200 sample images.
This class imbalance in the training dataset is likely to result in a biased model being learned, since the model simply sees images corresponding to some traffic-sign classes more often than others.
Sampling was used to resolve class imbalance and also to prevent overfitting.
The WeightedRandomSampler() was used to sample the images corresponding to all of the 43 classes with equal frequency (2,000 images per class, which increases the total number of input images too) and then they were passed to the PyTorch data loader.
The
Become an O’Reilly member and get unlimited access to this title plus top books and audiobooks from O’Reilly and nearly 200 top publishers, thousands of courses curated by job role, 150+ live events each month,
and much more.
Read now
Unlock full access