© The Author(s), under exclusive license to APress Media, LLC, part of Springer Nature 2023
M. Hu, The Art of Reinforcement Learning, https://doi.org/10.1007/978-1-4842-9606-6_9

9. Policy Gradient Methods

Michael Hu
Shanghai, Shanghai, China

In this chapter, we explore policy-based methods, another family of reinforcement learning algorithms. In previous chapters, we focused on value-based methods, which estimate the optimal state-action value function. Value-based methods can become computationally expensive when the state or action space is large, and they can struggle in environments with stochastic dynamics.
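To make the cost argument concrete, here is a minimal sketch of how a value-based agent might pick an action from estimated state-action values. The toy table `q_values`, its state names, and the helper `greedy_action` are hypothetical and chosen only for illustration; they are not taken from the book's code.

```python
# Hypothetical toy table: q_values[state][action] -> estimated return.
q_values = {
    "s0": [0.1, 0.5, 0.2],
    "s1": [0.7, 0.3, 0.0],
}


def greedy_action(q_table, state):
    """Return the index of the action with the highest estimated value."""
    action_values = q_table[state]
    # The scan visits every action once, so the cost of choosing an action
    # grows linearly with the number of actions.
    return max(range(len(action_values)), key=lambda a: action_values[a])


best_in_s0 = greedy_action(q_values, "s0")
best_in_s1 = greedy_action(q_values, "s1")
```

Acting this way needs a value estimate for every action in the current state, which is one reason the approach gets costly as the action space grows.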

Policy-based methods, on the other hand, directly learn the optimal policy without estimating ...
