Learn OpenCV 4 by Building Projects - Second Edition by Prateek Joshi, Vinicius G. Mendonca, David Millan Escriva

YOLO v3 deep learning model architecture

Object detection in classical computer vision commonly uses a sliding window, scanning the whole image with windows of different sizes and scales. The main drawback is the time it takes to scan the image many times over to find objects.
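To give a rough sense of that cost, the following minimal sketch counts how many windows a sliding-window detector would have to classify. The function name `count_windows` and the image size, window size, stride, and scales are all illustrative assumptions, not values from the book.

```python
def count_windows(img_w, img_h, base_win, stride, scales):
    """Count the square windows visited by a sliding-window scan.

    All parameters are hypothetical and chosen only for illustration.
    """
    total = 0
    for scale in scales:
        win = base_win * scale
        if win > img_w or win > img_h:
            continue  # a window larger than the image cannot be placed
        # Number of positions along each axis for this window size.
        steps_x = (img_w - win) // stride + 1
        steps_y = (img_h - win) // stride + 1
        total += steps_x * steps_y
    return total


# Assumed setup: 416 x 416 image, 64-pixel base window, stride of 8, three scales.
print(count_windows(416, 416, 64, 8, [1, 2, 4]))  # prints 3835
```

Each of those windows needs its own classifier evaluation, which is why the scan becomes expensive as more sizes and scales are added.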

YOLO takes a different approach: it divides the image into an S x S grid. For each grid cell, YOLO checks B bounding boxes, and the deep learning model outputs, for each box, its coordinates, the confidence that it contains an object, and a confidence for each category in the training dataset. The following screenshot shows the S x S grid:

YOLO is trained with a 19 x 19 grid and 5 bounding boxes per grid cell, using 80 categories. ...
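The size of the model output implied by these numbers can be worked out with a short sketch. It assumes each bounding box is described by 4 coordinates (x, y, width, height), 1 object confidence, and one confidence per category; the 4-coordinate layout and the function name `yolo_output_shape` are assumptions made for illustration.

```python
def yolo_output_shape(grid_size, boxes_per_cell, num_classes):
    """Return the (rows, cols, values per cell) shape of the grid output.

    Assumes 4 box coordinates + 1 object confidence + one score per class.
    """
    values_per_box = 4 + 1 + num_classes
    return (grid_size, grid_size, boxes_per_cell * values_per_box)


shape = yolo_output_shape(grid_size=19, boxes_per_cell=5, num_classes=80)
total_values = shape[0] * shape[1] * shape[2]

print(shape)         # prints (19, 19, 425)
print(total_values)  # prints 153425
```

Under these assumptions a single forward pass yields every candidate box at once, in contrast to the repeated scans of a sliding window.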
