Holding the whole dataset in memory to train a network, as in the example in Chapter 1, Setup and Introduction to TensorFlow, is intractable for large datasets. In practice, during training we divide the dataset into small pieces called mini-batches (or commonly just batches). Each mini-batch is then loaded in turn and fed to the network, where the backpropagation and gradient descent algorithms are run and the weights are updated. This is repeated for each mini-batch until the whole dataset has been processed.
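As a rough illustration (not an example from the book itself), the loop below sketches this mini-batch cycle using TensorFlow's tf.data API; the random toy dataset, model architecture, batch size, and learning rate are all placeholder choices:

```python
import numpy as np
import tensorflow as tf

# Hypothetical toy dataset: 10,000 samples, 20 features, binary labels.
x = np.random.rand(10000, 20).astype(np.float32)
y = np.random.randint(0, 2, size=(10000, 1)).astype(np.float32)

batch_size = 32
# tf.data shuffles the dataset and splits it into mini-batches for us.
dataset = (tf.data.Dataset.from_tensor_slices((x, y))
           .shuffle(buffer_size=10000)
           .batch(batch_size))

model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
loss_fn = tf.keras.losses.BinaryCrossentropy()
optimizer = tf.keras.optimizers.SGD(learning_rate=0.1)

for epoch in range(3):  # each epoch is one full pass over all mini-batches
    for batch_x, batch_y in dataset:
        with tf.GradientTape() as tape:
            predictions = model(batch_x, training=True)
            loss = loss_fn(batch_y, predictions)
        # Backpropagation: gradients are computed from this mini-batch only.
        grads = tape.gradient(loss, model.trainable_variables)
        # Gradient descent step: update the weights with this noisy estimate.
        optimizer.apply_gradients(zip(grads, model.trainable_variables))
```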
The gradient calculated on a mini-batch is a noisy estimate of the true gradient over the whole training set, but by repeatedly applying these small, noisy updates, we will still ...