© The Author(s), under exclusive license to APress Media, LLC, part of Springer Nature 2023
M. HuThe Art of Reinforcement Learninghttps://doi.org/10.1007/978-1-4842-9606-6_11

11. Advanced Policy Gradient Methods

Michael Hu1  
Shanghai, Shanghai, China

One of the primary challenges associated with policy gradient methods is their instability and sensitivity to hyperparameters, such as the learning rate. This can lead to oscillations in the agent’s performance, resulting in slow convergence or even divergence. Furthermore, these methods often suffer from high variance in gradient estimates, which hampers convergence speed. Moreover, standard policy gradient methods exhibit poor sample efficiency, as they only use the generated sample transitions ...

Get The Art of Reinforcement Learning: Fundamentals, Mathematics, and Implementations with Python now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.