Chapter 11. Advanced Vision Problems

So far in this book, we have looked primarily at the problem of classifying an entire image. In Chapter 2 we touched on image regression, and in Chapter 4 we discussed object detection and image segmentation. In this chapter, we will look at more advanced problems that can be solved using computer vision: measurement, counting, pose estimation, and image search.


The code for this chapter is in the 11_adv_problems folder of the book’s GitHub repository. We will provide file names for code samples and notebooks where applicable.

Object Measurement

Sometimes we want to know the measurements of an object within an image (e.g., that a sofa is 180 cm long). While we can simply use pixel-wise regression to measure something like ground precipitation using aerial images of cloud cover, we will need to do something more sophisticated for the object measurement scenario. We can’t simply count the number of pixels and infer a size from that, because the same object could be represented by a different number of pixels due to where it is within the image, its rotation, aspect ratio, etc. Let’s walk through the four steps needed to measure an object from a photograph of it, following an approach suggested by Imaginea Labs.
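To make the pixel-counting problem concrete, here is a minimal sketch. It assumes a simple pinhole camera model and an object lying parallel to the image plane; neither assumption comes from the text. The function names and the numbers (focal length, distances, a 30 cm reference object) are hypothetical and chosen only for illustration. The same 180 cm sofa spans a different number of pixels at different distances, and a reference object of known size in the same plane lets us recover a real-world length anyway.

```python
def apparent_length_px(real_length_cm, distance_cm, focal_length_px):
    """Length in pixels of an object under an idealized pinhole camera.

    Assumes the object is parallel to the image plane (illustrative only).
    """
    return focal_length_px * real_length_cm / distance_cm


def measure_with_reference(object_px, reference_px, reference_cm):
    """Convert a pixel length to cm using a reference object of known size.

    Assumes the reference lies in the same plane as the measured object.
    """
    pixels_per_cm = reference_px / reference_cm
    return object_px / pixels_per_cm


focal = 1000.0  # hypothetical focal length, in pixels

# The same 180 cm sofa photographed from 2 m and from 4 m.
sofa_near_px = apparent_length_px(180.0, 200.0, focal)
sofa_far_px = apparent_length_px(180.0, 400.0, focal)

# A hypothetical 30 cm reference object placed next to the sofa at 4 m.
reference_px = apparent_length_px(30.0, 400.0, focal)
sofa_estimate_cm = measure_with_reference(sofa_far_px, reference_px, 30.0)

print(sofa_near_px, sofa_far_px, sofa_estimate_cm)
```

The pixel length halves when the distance doubles, so a pixel count alone says nothing about real size, while the ratio to a known reference is unaffected by distance. Real photographs also involve perspective distortion and rotation, which this sketch deliberately ignores.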

Reference Object

Suppose we’re an online shoe store, and we want to help customers find the best shoe size by using photographs of their footprints. We ask customers to get their feet wet and step onto a paper material, then upload ...
