Chapter 5. Creating Vision Datasets

To carry out machine learning on images, we need images. Of the use cases we looked at in Chapter 4, the vast majority were for supervised machine learning. To train such models, we also need the correct answer, or label, for each image. If you are going to train an unsupervised ML model or a self-supervised model like a GAN or autoencoder, you can leave out the labels. In this chapter, we will look at how to create a machine learning dataset consisting of images and labels.
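As a rough illustration of what "images and labels" can mean in practice, the sketch below stores a dataset as a simple CSV manifest that pairs each image path with its label. This is only one possible layout, not necessarily the format used later in the chapter. The file names, the label values, and the helper functions write_manifest and read_manifest are all hypothetical.

```python
import csv
import io

# Hypothetical image paths and labels, for illustration only.
records = [
    ("images/img_0001.jpg", "daisy"),
    ("images/img_0002.jpg", "tulip"),
    ("images/img_0003.jpg", "daisy"),
]


def write_manifest(rows, fileobj):
    """Write (image_path, label) pairs as CSV with a header row."""
    writer = csv.writer(fileobj)
    writer.writerow(["image_path", "label"])
    writer.writerows(rows)


def read_manifest(fileobj, with_labels=True):
    """Read the manifest back.

    With with_labels=True, return (image_path, label) pairs, as a
    supervised model would need. With with_labels=False, return only
    the image paths, as an unsupervised or self-supervised model
    would need.
    """
    reader = csv.DictReader(fileobj)
    if with_labels:
        return [(row["image_path"], row["label"]) for row in reader]
    return [row["image_path"] for row in reader]


# An in-memory buffer stands in for a file on disk.
buffer = io.StringIO()
write_manifest(records, buffer)

buffer.seek(0)
supervised = read_manifest(buffer)

buffer.seek(0)
unlabeled = read_manifest(buffer, with_labels=False)
```

Here supervised holds the image paths together with their labels, while unlabeled holds the paths alone, mirroring the distinction between supervised training and training without labels.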

Tip

The code for this chapter is in the 05_create_dataset folder of the book’s GitHub repository. We will provide file names for code samples and notebooks where applicable.

Collecting Images

In most ML projects, the first stage is to collect the data. The data collection might be done in any number of ways: by mounting a camera at a traffic intersection, connecting to a digital catalog to obtain photographs of auto parts, purchasing an archive of satellite imagery, etc. It can be a logistical activity (mounting traffic cameras), a technical activity (building a software connector to the catalog database), or a commercial one (purchasing an image archive).
