April 2018
Intermediate to advanced
334 pages
10h 18m
English
A single shot detector (SSD) is known for its balance between speed and accuracy. SSD just like YOLO runs a CNN on the input image only once to learn the representations. A small 3x3 convolution kernel is run on this representation to predict the bounding boxes and class probability. In order to handle the scale, SSD predicts bounding boxes after multiple convolutional layers. Since each convolutional layer operates at a different scale, it is able to detect objects of various scales.
The performance metric for Fast R-CNN, Faster R-CNN, YOLO, and SSD are shown in the following graph:
