9 More stable value-based methods

In this chapter

  • You will improve on the methods you learned in the previous chapter by making them more stable and therefore less prone to divergence.
  • You will explore advanced value-based deep reinforcement learning methods, and the many components that make value-based methods better.
  • You will solve the cart-pole environment with fewer samples, and with more reliable and consistent results.

Let thy step be slow and steady, that thou stumble not.

— Tokugawa Ieyasu, founder and first shōgun of the Tokugawa shogunate of Japan and one of the three unifiers of Japan

In the last chapter, you learned about value-based deep reinforcement learning. NFQ, the algorithm we developed, is a simple solution ...
