11 Policy-gradient and actor-critic methods
In this chapter
- You will learn about a family of deep reinforcement learning methods that can optimize their performance directly, without the need for value functions.
- You will learn how to use value functions to make these algorithms even better.
- You will implement deep reinforcement learning algorithms that use multiple processes at once for very fast learning.
There is no better than adversity. Every defeat, every heartbreak, every loss, contains its own seed, its own lesson on how to improve your performance the next time.
— Malcolm X, American Muslim minister and human rights activist
In this book, we’ve explored methods that can find optimal and near-optimal policies with the help of value ...