Chapter 2. Introduction to Computer Vision

While this book isn’t designed to teach you all of the fundamentals of architecting and training machine learning models, I do want to cover some basic scenarios so that the book can still work as a standalone. If you want to learn more about the model creation process with TensorFlow, I recommend my book, AI and Machine Learning for Coders, published by O’Reilly. If you want to go deeper than that, Aurélien Géron’s excellent book Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow (O’Reilly) is a must!

In this chapter, we’ll go beyond the very fundamental model you created in Chapter 1 and look at two more sophisticated ones, where you will deal with computer vision—namely how computers can “see” objects. Similar to the terms “artificial intelligence” and “machine learning,” the phrases “computer vision” and “seeing” might lead one to misunderstand what is fundamentally going on in the model.

Computer vision is a huge field, and for the purposes of this book and this chapter, we’ll focus narrowly on a couple of core scenarios, where we will use technology to parse the contents of images, either labeling the primary content of an image, or finding items within an image.

It’s not really about “vision” or “seeing,” but rather about having a structured algorithm that allows a computer to parse the pixels of an image. It doesn’t “understand” the image any more than it understands the meaning of a sentence when it parses the words ...
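
To make the idea of "parsing pixels" concrete, here is a minimal sketch in plain Python. It is not one of the models built in this chapter: the 4x4 image, the brightness threshold of 128, and the helper names `bright_pixel_coords` and `bounding_box` are all made up for illustration. It shows only that, to the computer, an image is a grid of numbers, and that "finding" something in it means applying a rule to those numbers.

```python
# A tiny 4x4 grayscale "image": each number is a pixel intensity
# (0 = black, 255 = white). The values are invented for illustration.
image = [
    [0,   0,   0,   0],
    [0, 255, 255,   0],
    [0, 255, 255,   0],
    [0,   0,   0,   0],
]


def bright_pixel_coords(pixels, threshold=128):
    """Return (row, col) for every pixel brighter than the threshold."""
    return [
        (r, c)
        for r, row in enumerate(pixels)
        for c, value in enumerate(row)
        if value > threshold
    ]


def bounding_box(coords):
    """Return the smallest (top, left, bottom, right) box around coords, or None."""
    if not coords:
        return None
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    return (min(rows), min(cols), max(rows), max(cols))


coords = bright_pixel_coords(image)
box = bounding_box(coords)
print(box)  # (1, 1, 2, 2)
```

The code locates the bright square without any notion of what a square is. A trained model replaces the hand-written threshold rule with patterns learned from data, but it is still computing over pixel values rather than "seeing."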
