Chapter 3. Image Analysis and Recognition on the Cloud

The landscape of machine learning changed in 2012, when a team supervised by Dr. Geoffrey Hinton of the University of Toronto won the ImageNet challenge. The team did this with a sleeper machine learning technology called deep learning, paired with a network architecture called a convolutional neural network, or CNN. Their CNN beat what was considered standard at the time on the ImageNet challenge by roughly ten percentage points in top-5 error, a margin few had thought possible. Since then, deep learning and CNNs have been used to match or surpass human performance on image recognition benchmarks and other vision-related tasks. However, these models are still not trivial to build and often require substantial experience, knowledge, and computing power. Fortunately, Google provides a number of pretrained Vision AI services that we can use to our great benefit, as we will learn in this chapter.

In this chapter, we will learn how deep learning is now able to classify images and perform other vision-related tasks better than humans. We will then construct a simple image classifier to see how the underlying technology performs object recognition. Next, we will work with the Google Vision AI Building Block, exploring the API and performing tasks such as product search. We will finish the chapter by looking at the AutoML Vision service and how it can be used to build vision models automatically.

The following is a summary of ...
