5 Modern training techniques

This chapter covers

  • Improving long-term training using a learning rate schedule
  • Improving short-term training using optimizers
  • Combining learning rate schedules and optimizers to improve any deep model’s results
  • Tuning network hyperparameters with Optuna

At this point, we have learned the basics of neural networks and three types of architectures: fully connected, convolutional, and recurrent. These networks have been trained with an approach called stochastic gradient descent (SGD), which has been in use since at least the 1960s. Since then, improvements to learning the parameters of our networks have been invented, like momentum and learning rate decay, which can improve training for any neural network on any problem ...
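To make the two improvements mentioned above concrete, here is a minimal sketch (not code from this book) of SGD with momentum and an exponential learning rate decay, applied to a toy one-parameter problem, minimizing f(w) = w². The function name `train` and all hyperparameter values are illustrative assumptions, not anything defined in the text.

```python
# Minimal illustrative sketch: SGD with momentum plus exponential
# learning-rate decay, minimizing f(w) = w**2 (gradient: 2*w).
# All names and hyperparameters here are illustrative choices.

def train(steps=200, lr=0.1, momentum=0.9, decay=0.99):
    w = 5.0   # parameter, starting far from the minimum at 0
    v = 0.0   # velocity: a running, decaying sum of past gradients
    for _ in range(steps):
        grad = 2 * w                  # gradient of f(w) = w**2 at w
        v = momentum * v - lr * grad  # momentum smooths the update direction
        w = w + v                     # take the momentum-adjusted step
        lr = lr * decay               # learning rate decay: smaller steps later
    return w

print(train())  # ends very close to the minimum at w = 0
```

Momentum dampens the zig-zagging of raw gradient steps, while the decaying learning rate lets training take large steps early and settle gently later; real libraries such as PyTorch provide both as ready-made optimizer and scheduler options.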
