The VGG network builds on AlexNet and was created by Andrew Zisserman and Karen Simonyan at the Visual Geometry Group (VGG) at the University of Oxford. It was introduced in 2014 and published in 2015. This architecture is simpler than the one we saw earlier, and its more uniform design gives us a much better framework to work with. Like AlexNet, VGGNet was trained on the ImageNet dataset; its input is a 224 × 224 × 3 image cropped from the rescaled images in the dataset. You may have noticed that we have headed this section VGG-16. This is because this version of the network has 16 layers with learnable weights (13 convolutional and 3 fully connected). There are variants of this architecture that have 11, 13, and 19 layers.
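To make the layer counts concrete, here is a minimal sketch that tallies the weight layers in each variant. The per-block convolution counts follow the configurations in the original VGG paper; the names `CONV_LAYERS_PER_BLOCK`, `NUM_FC_LAYERS`, and `count_weight_layers` are our own, chosen for illustration, and pooling layers are left out because they have no learnable weights.

```python
# Number of convolutional layers in each of the five blocks, per variant.
# The counts follow the original VGG paper; the names are illustrative only.
CONV_LAYERS_PER_BLOCK = {
    "VGG-11": [1, 1, 2, 2, 2],
    "VGG-13": [2, 2, 2, 2, 2],
    "VGG-16": [2, 2, 3, 3, 3],
    "VGG-19": [2, 2, 4, 4, 4],
}

# Every variant ends with the same three fully connected layers.
NUM_FC_LAYERS = 3


def count_weight_layers(variant):
    """Return the number of layers with learnable weights in a variant."""
    return sum(CONV_LAYERS_PER_BLOCK[variant]) + NUM_FC_LAYERS


for name in CONV_LAYERS_PER_BLOCK:
    print(name, count_weight_layers(name))
```

Each variant's name matches its total: VGG-16, for example, has 2 + 2 + 3 + 3 + 3 = 13 convolutional layers plus 3 fully connected layers.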
We will first explore the basic building blocks of the network, known as VGG blocks. These blocks are made up ...