When training a model, a few common terms describe the different parts of the iterative optimization:
- An iteration is one instance of calculating the error gradient and adjusting the model parameters. When the data is fed in groups of samples, each of these groups is called a batch.
- A batch can include the whole dataset (traditional batching) or only a small subset of it, with successive subsets fed forward until the whole dataset has been covered; the latter is called mini-batching. The number of samples per batch is called the batch size.
- Each complete pass over the whole dataset is called an epoch.
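
The relationship between these terms can be illustrated with a minimal Python sketch. It is not tied to any particular framework: the helper names `iterations_per_epoch`, `mini_batches`, and `count_iterations` are hypothetical, the dataset is a plain list of integers, and the gradient step is left as a comment. The sketch also assumes the last batch of an epoch is kept even when it is smaller than the batch size.

```python
import math


def iterations_per_epoch(num_samples, batch_size):
    """Number of batches (and so iterations) needed to cover the dataset once."""
    return math.ceil(num_samples / batch_size)


def mini_batches(samples, batch_size):
    """Yield successive batches; the final batch may be smaller than batch_size."""
    for start in range(0, len(samples), batch_size):
        yield samples[start:start + batch_size]


def count_iterations(samples, batch_size, num_epochs):
    """Walk the epoch/batch loops and count how many iterations occur."""
    iterations = 0
    for _epoch in range(num_epochs):          # one epoch = one pass over the dataset
        for _batch in mini_batches(samples, batch_size):
            # A real training loop would compute the error gradient on
            # this batch and adjust the model parameters here.
            iterations += 1                   # one iteration per batch
    return iterations


dataset = list(range(10))                     # toy dataset of 10 samples
print(iterations_per_epoch(len(dataset), 4))  # mini-batching with batch size 4
print(iterations_per_epoch(len(dataset), len(dataset)))  # batch = whole dataset
print(count_iterations(dataset, 4, 2))        # two epochs of mini-batches
```

With a batch size equal to the dataset size, each epoch consists of a single iteration; with smaller batches, one epoch spans several iterations.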