9 More stable value-based methods

In this chapter

You will improve on the methods you learned in the previous chapter by making them more stable and therefore less prone to divergence.
You will explore advanced value-based deep reinforcement learning methods, and the many components that make value-based methods better.
You will solve the cart-pole environment with fewer samples, and with more reliable and consistent results.

Let thy step be slow and steady, that thou stumble not.

— Tokugawa Ieyasu, founder and first shōgun of the Tokugawa shogunate and one of the three unifiers of Japan

In the last chapter, you learned about value-based deep reinforcement learning. NFQ, the algorithm we developed, is a simple solution ...

Grokking Deep Reinforcement Learning by Miguel Morales