Understanding policy gradient methods

Before working with policy gradient (PG) methods, we need to understand why we need them and the intuition behind them. We can then cover some of the mathematics very briefly before diving into the code. So, let's look at the motivation for PG methods and what they aim to achieve beyond the methods we have looked at previously. I have summarized the main points of what PG methods do and the problems they try to solve:

  • Deterministic versus stochastic functions: We often learn early in science and mathematics that many problems have a single, deterministic answer. In the real world, however, we often attach some amount of error to deterministic calculations to quantify their accuracy. This quantification of how ...
