Building Computer Vision Projects with OpenCV 4 and C++

by David Millan Escriva, Prateek Joshi, Vinicius G. Mendonca, Roy Shilkrot

March 2019

Intermediate to advanced

538 pages

13h 38m

English

Packt Publishing

YOLO v3 deep learning model architecture

Classical object detection in computer vision typically uses a sliding window, scanning the whole image with different window sizes and scales. The main drawback is the time cost: the image has to be scanned several times to find objects.
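To get a feel for that cost, the following minimal sketch counts how many windows a sliding-window detector would have to classify. The function names, the square window shapes, and the fixed stride are illustrative assumptions, not something taken from a particular detector:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Number of window positions along one axis for a given window length and
// stride. Returns 0 when the window does not fit in the image.
std::size_t positionsAlongAxis(int imageLen, int windowLen, int stride) {
    assert(stride > 0 && windowLen > 0);
    if (windowLen > imageLen) return 0;
    return static_cast<std::size_t>((imageLen - windowLen) / stride + 1);
}

// Total number of square windows evaluated over all window sizes
// (hypothetical helper; real detectors may also rescale the image).
std::size_t countSlidingWindows(int width, int height,
                                const std::vector<int>& windowSizes,
                                int stride) {
    std::size_t total = 0;
    for (int w : windowSizes) {
        total += positionsAlongAxis(width, w, stride) *
                 positionsAlongAxis(height, w, stride);
    }
    return total;
}
```

Under these assumptions, calling `countSlidingWindows(640, 480, {64, 128, 256}, 8)` gives 8215 windows, each of which would need its own classifier pass.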

YOLO takes a different approach by dividing the image into an S x S grid. For each grid cell, YOLO checks for B bounding boxes, and the deep learning model outputs, for each box, the bounding box itself, the confidence that it contains an object, and the confidence for each category in the training dataset. The following screenshot shows the S x S grid:
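One way to picture the grid is as a mapping from pixel coordinates to a cell index. The sketch below assumes the grid evenly covers the image; `GridCell` and `cellForPoint` are hypothetical names introduced only for illustration:

```cpp
#include <algorithm>
#include <cassert>

// Row and column of a cell in an S x S grid (illustrative type).
struct GridCell {
    int row;
    int col;
};

// Maps a pixel position (x, y) to the grid cell that contains it,
// assuming the S x S grid is laid evenly over the whole image.
GridCell cellForPoint(double x, double y,
                      int imageWidth, int imageHeight, int S) {
    assert(S > 0 && imageWidth > 0 && imageHeight > 0);
    int col = static_cast<int>(x * S / imageWidth);
    int row = static_cast<int>(y * S / imageHeight);
    // Clamp so that points on or past the image border stay inside the grid.
    col = std::max(0, std::min(col, S - 1));
    row = std::max(0, std::min(row, S - 1));
    return GridCell{row, col};
}
```

For example, with a 640 x 480 image and S = 4, the point (320, 240) falls in row 2, column 2.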

YOLO is trained with a 19 x 19 grid and 5 bounding boxes per grid cell, using 80 categories. ...
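As a rough check on these numbers, the sketch below counts the values the model would output per box, per cell, and for the whole grid. It assumes each bounding box is encoded with 4 coordinates (for example x, y, width, and height), plus one object confidence and one confidence per category; the function names are hypothetical:

```cpp
#include <cassert>
#include <cstddef>

// Values predicted for one bounding box: 4 box coordinates (assumed
// encoding), 1 object confidence, and one confidence per category.
std::size_t valuesPerBox(std::size_t numClasses) {
    return 4 + 1 + numClasses;
}

// Values predicted for one grid cell holding B bounding boxes.
std::size_t valuesPerCell(std::size_t boxesPerCell, std::size_t numClasses) {
    return boxesPerCell * valuesPerBox(numClasses);
}

// Total number of values for an S x S grid.
std::size_t outputSize(std::size_t gridSize, std::size_t boxesPerCell,
                       std::size_t numClasses) {
    assert(gridSize > 0);
    return gridSize * gridSize * valuesPerCell(boxesPerCell, numClasses);
}
```

With S = 19, B = 5, and 80 categories, `valuesPerCell(5, 80)` is 425 and `outputSize(19, 5, 80)` is 153425 under this encoding.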