June 2018
Intermediate to advanced
546 pages
13h 30m
English
In this chapter, we saw an alternative way of solving RL problems: policy gradients (PG), which differ in many ways from the familiar DQN method. We explored the basic method, called REINFORCE, which is a generalization of the cross-entropy method, our first method in the RL domain. REINFORCE is simple, but when applied to the Pong environment, it did not show good results.
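As a reminder of what REINFORCE computes, here is a minimal sketch in plain Python, not the chapter's actual implementation. The function names `discounted_returns` and `reinforce_loss`, and the default `gamma=0.99`, are assumptions made for illustration. The sketch takes the probabilities the policy assigned to the actions actually taken in one episode, weights their logarithms by the discounted return from each step, and negates the sum so that minimizing the loss increases the probability of actions followed by high returns.

```python
import math


def discounted_returns(rewards, gamma=0.99):
    """Return the discounted return G_t for every step of one episode."""
    returns = []
    running = 0.0
    # Walk the episode backwards so each G_t reuses G_{t+1}.
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def reinforce_loss(action_probs, rewards, gamma=0.99):
    """Compute -sum_t G_t * log pi(a_t | s_t) for one episode.

    action_probs: probability the policy gave to each action taken.
    rewards: reward received after each of those actions.
    """
    returns = discounted_returns(rewards, gamma)
    return -sum(g * math.log(p) for g, p in zip(returns, action_probs))
```

In a real agent, the probabilities would come from a neural network and the loss would be differentiated by an autograd library; this version only shows the arithmetic.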
In the next chapter, we'll consider ways to improve the stability of PG by combining the value-based and policy-based families of methods.