Deep learning models are often known as black box models because it is very difficult to actually interpret how the model is working internally compared to simpler ML models, such as decision trees. We know that CNN-based deep learning models use convolution layers, which use filters to extract activation feature maps representing a spatial hierarchy of features. Conceptually, the top convolutional layers learn small local patterns, and the layers that are lower in the network learn more complex and larger patterns, which are obtained from the top convolution layers. Let's try to visualize this with an example.
We will take our best model (transfer learning with fine-tuning and image augmentation) and ...