11 Policy-gradient and actor-critic methods

In this chapter

You will learn about a family of deep reinforcement learning methods that can optimize their performance directly, without the need for value functions.
You will learn how to use value functions to make these algorithms even better.
You will implement deep reinforcement learning algorithms that use multiple processes at once for very fast learning.

There is no better than adversity. Every defeat, every heartbreak, every loss, contains its own seed, its own lesson on how to improve your performance the next time.

— Malcolm X American Muslim minister and human rights activist

In this book, we’ve explored methods that can find optimal and near-optimal policies with the help of value ...

Grokking Deep Reinforcement Learning by Miguel Morales
