How it works...
The HOG descriptors were computed from the input image and fed into a pretrained SVM classifier (obtained using cv2's HOGDescriptor_getDefaultPeopleDetector()). The classifier predicted the presence or absence of a person in each image block at multiple scales, through the detectMultiScale() function from OpenCV-Python.
The same object was often detected several times at different scales, and these overlapping detections were fused into one using non-maximum suppression (some false positives may still remain).
The non_max_suppression() function was invoked so that the same object was not reported more than once across positions and scales.
Non-maximum suppression used the intersection over union (IoU) to measure how much two bounding boxes overlap, ...
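To make the overlap measure concrete, here is a minimal pure-Python sketch of IoU together with a simple greedy non-maximum suppression built on it. It is not the non_max_suppression() function used in the recipe: the names iou and greedy_nms, the (x1, y1, x2, y2) corner format, and the 0.5 threshold are assumptions made for illustration. Rectangles returned by detectMultiScale() are (x, y, w, h), so they would need converting to corners first.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the intersection rectangle.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Width and height are clamped to zero when the boxes do not overlap.
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def greedy_nms(boxes, scores, iou_threshold=0.5):
    """Return indices of boxes to keep, highest score first.

    A box is dropped when its IoU with an already kept box
    exceeds iou_threshold (an illustrative default).
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep
```

For example, two 10 x 10 boxes offset by 5 pixels in each direction share 25 square pixels out of a union of 175, giving an IoU of 1/7. Two nearly coincident detections of one person would score well above 0.5, so only the higher-scoring one survives.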