April 2018
Intermediate to advanced
334 pages
10h 18m
English
As per the policy gradient theorem, for the previous specified policy objective functions and any differentiable policy
the policy gradient is as follows:

Steps to update parameters using the Monte Carlo policy gradient based approach is shown in the following section.