An optimizer is really just the algorithm we use to update a network's weights from the error that backpropagation pushes back through the network. As we learned back in Chapter 1, Deep Learning for Games, the base optimization algorithm is gradient descent, along with its more advanced variant, stochastic gradient descent (SGD).
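To make the idea concrete, here is a minimal sketch of a single gradient descent update. This is illustrative only; the function name gradient_descent_step and the learning_rate value are our own choices, and loss_gradient stands in for whatever gradient backpropagation actually produces:

```python
import numpy as np

def gradient_descent_step(weights, loss_gradient, learning_rate=0.01):
    # Move the weights a small step against the gradient of the loss.
    return weights - learning_rate * loss_gradient

weights = np.array([0.5, -0.3])
grad = np.array([0.2, -0.1])       # pretend backpropagation produced this
weights = gradient_descent_step(weights, grad)
print(weights)                      # [ 0.498 -0.299]
```

Every optimizer we will look at is, at heart, a refinement of this single update step.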
SGD works by estimating the gradient on small, randomly selected batches of data, shuffling the batch order on each training iteration, rather than evaluating it over the entire dataset at once. While SGD works well in most cases, it does not perform well in a GAN, due to a problem known as the vanishing/exploding gradient, which arises when we try to train multiple networks that are coupled together. Remember, we are feeding the output of our generator directly into the discriminator, so the gradients have to flow back through both networks. Instead, we look to more advanced ...
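Before moving on, here is a minimal sketch of the mini-batch sampling that the SGD description above refers to. The helper name sgd_epoch, the batch_size value, and compute_gradient (a hypothetical stand-in for backpropagation over a batch) are all assumptions made for illustration:

```python
import numpy as np

def sgd_epoch(weights, X, y, compute_gradient,
              learning_rate=0.01, batch_size=32):
    # Pick a random batch order for this pass over the data.
    indices = np.random.permutation(len(X))
    for start in range(0, len(X), batch_size):
        batch = indices[start:start + batch_size]
        # Estimate the gradient on this small batch only.
        grad = compute_gradient(weights, X[batch], y[batch])
        weights = weights - learning_rate * grad
    return weights
```

The random batching is what puts the "stochastic" in SGD; each update follows a noisy estimate of the true gradient rather than the exact one.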