August 2019
Intermediate to advanced
242 pages
5h 45m
English
As our dataset is much larger now, we also need to think about the practicalities of training on it. Training on an item-by-item basis works, but we can also train on items in batches. Instead of stepping through all 60,000 MNIST items one at a time, we can split each pass over the data into 600 iterations, with a batch of 100 items in each. For our dataset, this means feeding our model a 100 x 784 matrix as input instead of a 784-value-long vector. We could also feed it a three-dimensional tensor of 100 x 28 x 28, but we'll do that in a later chapter when we cover a model architecture that makes good use of this structure.
In code, this is just a loop over the batches, as follows:
for b := 0; b ...
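To make the batching arithmetic concrete, here is a minimal, library-free sketch. It is not the book's listing: the names `batchRange`, `batchRanges`, and `sliceBatch` are hypothetical helpers introduced only for illustration, and the data is assumed to be stored as one flat, row-major `[]float64` slice with one row per example.

```go
package main

import "fmt"

// batchRange is a hypothetical helper type holding the half-open
// interval [Start, End) of example indices that make up one batch.
type batchRange struct{ Start, End int }

// batchRanges splits numExamples into consecutive batches of at most
// batchSize examples. The last batch is shorter if the sizes do not
// divide evenly.
func batchRanges(numExamples, batchSize int) []batchRange {
	if numExamples <= 0 || batchSize <= 0 {
		return nil
	}
	numBatches := (numExamples + batchSize - 1) / batchSize // ceiling division
	ranges := make([]batchRange, 0, numBatches)
	for b := 0; b < numBatches; b++ {
		start := b * batchSize
		end := start + batchSize
		if end > numExamples {
			end = numExamples
		}
		ranges = append(ranges, batchRange{Start: start, End: end})
	}
	return ranges
}

// sliceBatch returns the rows of one batch from flat, row-major data
// (assumed layout: example i occupies data[i*features : (i+1)*features]).
// The result shares memory with data; nothing is copied.
func sliceBatch(data []float64, features int, r batchRange) []float64 {
	return data[r.Start*features : r.End*features]
}

func main() {
	// MNIST-sized bookkeeping: 60,000 examples in batches of 100.
	mnist := batchRanges(60000, 100)
	fmt.Println("batches per epoch:", len(mnist))

	// A tiny stand-in dataset (10 examples, 4 features) to show slicing
	// without allocating the full 60,000 x 784 matrix.
	const examples, features = 10, 4
	data := make([]float64, examples*features)
	for _, r := range batchRanges(examples, 3) {
		batch := sliceBatch(data, features, r)
		fmt.Printf("examples [%d, %d): %d x %d values\n",
			r.Start, r.End, len(batch)/features, features)
	}
}
```

Each slice returned by `sliceBatch` would then be wrapped in whatever matrix or tensor type the model expects, with shape (batch rows) x (features). Clamping the final batch keeps the loop correct when the dataset size is not a multiple of the batch size.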