November 2019
Intermediate to advanced
304 pages
8h 40m
English
In step 1, we used FileSplit to filter the images based on the file type (PNG, JPEG, TIFF, and so on).
We also passed in a random number generator created from a single seed. This seed value is an integer (42 in our example). With it, FileSplit generates the list of file paths in random order. Shuffling keeps the images from reaching the network grouped by directory (and therefore by label), which generally helps training and the resulting accuracy metrics, while the fixed seed makes the order reproducible from run to run.
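To see what a seeded shuffle buys you, here is a minimal sketch that uses only the Java standard library. It does not call FileSplit; the class name SeededShuffleSketch, the helper shuffled, and the sample paths are hypothetical and exist only for illustration. It shows that the same seed always produces the same shuffled order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

class SeededShuffleSketch {

    // Returns a shuffled copy of the given paths; the input list is left untouched.
    static List<String> shuffled(List<String> paths, long seed) {
        List<String> copy = new ArrayList<>(paths);
        Collections.shuffle(copy, new Random(seed));
        return copy;
    }

    public static void main(String[] args) {
        // Hypothetical file paths, grouped by label directory as they would be on disk.
        List<String> paths = Arrays.asList("cat/1.png", "cat/2.png", "dog/1.png", "dog/2.png");

        List<String> first = shuffled(paths, 42L);
        List<String> second = shuffled(paths, 42L);

        System.out.println(first);
        // Same seed, same order, so this prints true.
        System.out.println(first.equals(second));
    }
}
```

Changing the seed will usually change the order, so keep it fixed when you want to compare training runs.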
If you have a ready-made dataset with an unknown number of labels, you need to calculate numLabels rather than hardcode it. Hence, we used FileSplit to count the label subdirectories under the root directory programmatically:
int numLabels = fileSplit.getRootDir().listFiles(File::isDirectory).length;
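Note that File.listFiles returns null when the path is not a directory or cannot be read, so the one-liner above can throw a NullPointerException on a bad path. A more defensive version might look like the following sketch, which uses only the Java standard library. The class name LabelCountSketch, the helper countLabels, and the temporary dataset layout are hypothetical, and it assumes one subdirectory per label.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

class LabelCountSketch {

    // Counts the immediate subdirectories of rootDir; each one is treated as a label.
    static int countLabels(File rootDir) {
        File[] labelDirs = rootDir.listFiles(File::isDirectory);
        // listFiles returns null if rootDir is not a directory or cannot be read.
        return labelDirs == null ? 0 : labelDirs.length;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical dataset layout, created in a temp directory for the demo.
        File root = Files.createTempDirectory("dataset").toFile();
        new File(root, "cat").mkdir();
        new File(root, "dog").mkdir();
        new File(root, "readme.txt").createNewFile(); // a plain file, not a label

        System.out.println(countLabels(root)); // prints 2
    }
}
```

With a FileSplit in hand, you would pass fileSplit.getRootDir() to countLabels instead of the temporary directory.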
In step ...