M. Hu, The Art of Reinforcement Learning, https://doi.org/10.1007/978-1-4842-9606-6_9

9. Policy Gradient Methods

Michael Hu¹

(1) Shanghai, Shanghai, China

In this chapter, we will explore policy-based methods, another family of reinforcement learning algorithms. In previous chapters, we focused on value-based methods, which estimate the optimal state-action value function. Value-based methods can become computationally expensive when the state or action space is large, and they can struggle in environments with stochastic dynamics.

Policy-based methods, on the other hand, directly learn the optimal policy without estimating ...
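To make the idea of learning a policy directly more concrete, here is a minimal sketch rather than the chapter's own implementation. It assumes a hypothetical two-armed bandit with noisy rewards, a softmax policy over one preference value per action, and a plain REINFORCE-style update with no baseline. The names `softmax`, `train_bandit_policy`, `reward_means`, and `reward_noise` are illustrative and do not come from the book.

```python
import math
import random


def softmax(prefs):
    """Turn action preferences into a probability distribution."""
    m = max(prefs)  # subtract the max for numerical stability
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def train_bandit_policy(reward_means, reward_noise=0.1, steps=2000, lr=0.1, seed=0):
    """Adjust policy parameters directly from sampled rewards.

    Assumption: a stateless bandit whose rewards are Gaussian around
    reward_means; no value function is estimated anywhere.
    """
    rng = random.Random(seed)
    prefs = [0.0] * len(reward_means)  # policy parameters, one per action
    for _ in range(steps):
        probs = softmax(prefs)
        action = rng.choices(range(len(prefs)), weights=probs)[0]
        reward = rng.gauss(reward_means[action], reward_noise)
        # For a softmax policy, d/d prefs[k] of log pi(action) is
        # 1[k == action] - probs[k]; scale it by the sampled reward.
        for k in range(len(prefs)):
            grad_log_pi = (1.0 if k == action else 0.0) - probs[k]
            prefs[k] += lr * reward * grad_log_pi
    return softmax(prefs)


# Hypothetical bandit: action 1 pays more on average than action 0.
final_probs = train_bandit_policy([0.0, 1.0])
print(final_probs)
```

The only learned quantities are the preferences that define the policy itself. Actions that lead to higher sampled rewards have their probabilities increased, so the policy should shift toward the better-paying action over time.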


The Art of Reinforcement Learning: Fundamentals, Mathematics, and Implementations with Python by Michael Hu
