M. Hu, The Art of Reinforcement Learning, https://doi.org/10.1007/978-1-4842-9606-6_11

11. Advanced Policy Gradient Methods

Michael Hu¹

(1) Shanghai, Shanghai, China

One of the primary challenges of policy gradient methods is their instability and sensitivity to hyperparameters such as the learning rate. This can cause the agent’s performance to oscillate, leading to slow convergence or even divergence. These methods also often suffer from high variance in their gradient estimates, which further slows convergence. In addition, standard policy gradient methods exhibit poor sample efficiency, as they only use the generated sample transitions ...

The Art of Reinforcement Learning: Fundamentals, Mathematics, and Implementations with Python by Michael Hu
