October 2019
Intermediate to advanced
366 pages
12h 4m
English
With the loss function and the optimization technique we just presented, you should be able to develop a deep Q-learning algorithm. However, the reality is much more subtle. Indeed, if we try to implement it, it probably won't work. Why? Once we introduce neural networks, we can no longer guarantee improvement. Although tabular Q-learning has convergence capabilities, its neural network counterpart does not.
Sutton and Barto in Reinforcement Learning: An Introduction, introduced a problem called the deadly triad, which arises when the following three factors are combined:
Read now
Unlock full access