So far, we have seen how to preprocess images and how to extract image metadata and link it to the original images. Now, we need to extract features from those preprocessed images so that they can be fed into CNNs.
We need three map operations for feature extraction: one for businesses, one for image data, and one for labels. Together, they ensure that we don't lose any image provenance (see the script):
- Business map of the form imageID | businessID
- Data map of the form imageID | image data
- Label map of the form businessID | labels
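To make the role of the three maps concrete, here is a minimal sketch of how they might be held in memory and chained to recover the labels of a single image. The class `ProvenanceMaps`, its field names, and the choice of `Integer` image IDs, `String` business IDs, and `Set<Integer>` labels are all assumptions made for illustration; the actual script may use different types.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical container for the three maps; names and types are illustrative only.
final class ProvenanceMaps {
    // Business map: imageID | businessID
    final Map<Integer, String> imageToBusiness = new LinkedHashMap<>();
    // Data map: imageID | image data (assumed here to be flattened pixel values)
    final Map<Integer, List<Integer>> imageToData = new LinkedHashMap<>();
    // Label map: businessID | labels
    final Map<String, Set<Integer>> businessToLabels = new LinkedHashMap<>();

    // Follows imageID -> businessID -> labels, so each image keeps its provenance.
    // Returns an empty set when the image or its business is unknown.
    Set<Integer> labelsForImage(int imageId) {
        String businessId = imageToBusiness.get(imageId);
        if (businessId == null) {
            return Collections.emptySet();
        }
        return businessToLabels.getOrDefault(businessId, Collections.emptySet());
    }
}
```

Because labels are attached to businesses rather than to individual images, the lookup has to pass through the business map; keeping the three maps separate preserves that chain.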
First, we must define a regular expression pattern to extract a jpg name from the CSVImageMetadataReader class, which is used to match against training labels:
public static ...
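As an illustration of how a pattern of this kind could be applied, the following sketch assumes that photo files are named by a numeric ID followed by `.jpg`. The class `JpgNameSketch`, the constant `JPG_ID`, the method `imageNameOf`, and the pattern itself are hypothetical and are not taken from the script.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; the pattern is an assumption, not the one used in the script.
final class JpgNameSketch {
    // Captures the run of digits immediately before ".jpg" at the end of the path.
    static final Pattern JPG_ID = Pattern.compile("(\\d+)\\.jpg$");

    // Returns the numeric file name without its extension, or empty if the path does not match.
    static Optional<String> imageNameOf(String path) {
        Matcher m = JPG_ID.matcher(path);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }
}
```

Returning an `Optional` rather than `null` forces the caller to handle files that do not follow the naming convention before looking them up in the label data.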