The batch norm layer

Earlier, we took care of initializing our weights to make the job of the gradient descent optimizer easier. However, the benefits are seen only during the early stages of training and do not guarantee improvements later on. That is where we turn to another great invention: the batch norm layer. The effect of using a batch norm layer in a CNN model is more or less the same as the input normalization we saw in Chapter 2, Deep Learning and Convolutional Neural Networks; the only difference is that it now happens at the outputs of all the convolution and fully connected layers in the model.
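
Concretely, the standard batch norm transform (shown here in general form, not in this book's notation) normalizes each activation using statistics computed over the current mini-batch and then applies a learnable scale and shift:

$$\hat{x}_i = \frac{x_i - \mu_B}{\sqrt{\sigma_B^2 + \epsilon}}, \qquad y_i = \gamma \hat{x}_i + \beta$$

Here, $\mu_B$ and $\sigma_B^2$ are the mean and variance of the mini-batch, $\epsilon$ is a small constant for numerical stability, and $\gamma$ and $\beta$ are parameters learned during training, which let the network recover the original scale of the activations if that turns out to be useful.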

The batch norm layer will usually be attached to the end of every fully connected or convolution layer, so that their outputs are normalized before being passed on, typically ahead of the activation function.
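
As a minimal sketch of this placement (assuming the tf.keras API rather than this book's exact code, which may use the older tf.layers interface), a convolution layer can be followed directly by a batch norm layer and only then by the activation:

```python
import tensorflow as tf

# Hypothetical example: a small block showing conv -> batch norm -> activation.
inputs = tf.keras.Input(shape=(32, 32, 3))

# Convolution without a bias term; the batch norm shift (beta) plays that role.
x = tf.keras.layers.Conv2D(64, 3, padding="same", use_bias=False)(inputs)

# Normalize the convolution outputs per channel using mini-batch statistics.
x = tf.keras.layers.BatchNormalization()(x)

# Apply the nonlinearity after normalization.
x = tf.keras.layers.ReLU()(x)

model = tf.keras.Model(inputs, x)
model.summary()
```

The same pattern is repeated after each convolution or fully connected layer in the network; at inference time, the layer switches to using the moving averages of mean and variance accumulated during training instead of per-batch statistics.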
