October 2019
Intermediate to advanced
366 pages
12h 4m
English
So far, we've addressed and developed value-based reinforcement learning algorithms. These algorithms learn a value function in order to be able to find a good policy. Despite the fact that they exhibit good performances, their application is constrained by some limits that are embedded in their inner workings. In this chapter, we'll introduce a new class of algorithms called policy gradient methods, which are used to overcome the constraints of value-based methods by approaching the RL problem from a different perspective.
Policy gradient methods select an action based on a learned parametrized policy, instead of relying on a value function. In this chapter, we will also elaborate on the theory and ...
Read now
Unlock full access