I want to provide you with an end-to-end look at what we will be doing in the code for the rest of this chapter. Remember that we are building a convolutional neural network (CNN) that examines objects in a video frame and outputs if one or more toys are in the image, and where they are:
Here is an overview of the process:
- Prepare a training set of images of the room with and without toys.
- Label the area of the images that contain toys – in a separate folder.
- Label images that don't contain toys – in a separate folder.
- Break the training set into two parts: a set we use to train the network, and a set ...