In the previous subsection, we saw that the naïve sliding window approach has severe performance drawbacks, since it cannot reuse many of the values it has already computed.
Each time the window moves, the network's millions of parameters have to be evaluated again over all the pixels in that window in order to get a prediction. In reality, most of this computation could be reused by introducing convolution (refer to Chapter 5, Image Classification using Transfer Learning, to get to know more on transfer learning using pre-trained DCNN architecture for image classification). This can be achieved in two incremental ways:
- By turning fully connected CNN layers into convolutional layers
- By using a convolutional sliding window (CSW)
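To make the first idea concrete, here is a minimal sketch in plain Python, not the implementation used in this chapter. It assumes a hypothetical one-layer "classifier" whose fully connected layer sees a 2 x 2 window; the names `dense` and `conv2d_valid`, the weights, and the 3 x 3 image are illustrative only. Reshaping the dense weights into a 2 x 2 kernel and convolving the whole image once gives the same scores as cropping each window and running the dense layer on it. In a real multi-layer network, the saving comes from sharing the intermediate feature maps between overlapping windows, which this single-layer toy does not show.

```python
def dense(window, weights, bias):
    """Fully connected layer: flatten the window, take one dot product."""
    flat = [v for row in window for v in row]
    return sum(w * v for w, v in zip(weights, flat)) + bias


def conv2d_valid(image, kernel, bias, stride=1):
    """'Valid' 2D convolution (cross-correlation) with a single square kernel."""
    k = len(kernel)
    rows = range(0, len(image) - k + 1, stride)
    cols = range(0, len(image[0]) - k + 1, stride)
    return [[sum(kernel[i][j] * image[r + i][c + j]
                 for i in range(k) for j in range(k)) + bias
             for c in cols]
            for r in rows]


# Hypothetical dense layer over a flattened 2x2 window (4 weights, 1 bias).
weights, bias = [0.5, -1.0, 2.0, 0.25], 0.1

# The same weights, reshaped row by row into a 2x2 convolution kernel.
kernel = [weights[0:2], weights[2:4]]

image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]

# Naive sliding window: crop every 2x2 window, run the dense layer on each.
naive = [[dense([row[c:c + 2] for row in image[r:r + 2]], weights, bias)
          for c in range(2)]
         for r in range(2)]

# Convolutional version: one pass over the full image yields a 2x2 score map,
# one score per window position.
conv = conv2d_valid(image, kernel, bias)

print(naive)
print(conv)
```

Each entry of the resulting score map corresponds to one window position, so the per-window crop-and-predict loop is replaced by a single convolution over the larger input.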
We have ...