Let's now discuss how to optimize a policy. In policy-based methods, the main objective is to find the best values of the policy's parameter vector. To decide which values are best, we measure the quality of the policy for different values of the parameter vector.
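To make the idea of "measuring the quality of a policy for different parameter values" concrete, here is a minimal sketch. It rests on assumptions that are not from the text above: a toy two-armed bandit with made-up rewards (`ARM_REWARDS`), a softmax policy, and expected reward as the quality measure. The names `action_probabilities` and `policy_quality` are hypothetical helpers chosen for illustration.

```python
import math

# Assumption: a toy two-armed bandit whose arms pay fixed expected rewards.
ARM_REWARDS = [1.0, 0.0]


def action_probabilities(params):
    """Softmax policy: map a parameter vector to action probabilities."""
    shift = max(params)  # subtract the max for numerical stability
    exps = [math.exp(p - shift) for p in params]
    total = sum(exps)
    return [e / total for e in exps]


def policy_quality(params):
    """Quality measure (assumed here): expected reward under the policy."""
    probs = action_probabilities(params)
    return sum(p * r for p, r in zip(probs, ARM_REWARDS))


# Score a few candidate parameter vectors and keep the best one.
candidates = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0]]
scores = {tuple(c): policy_quality(c) for c in candidates}
best = max(scores, key=scores.get)

for params, score in scores.items():
    print(params, round(score, 3))
print("best:", best)
```

Here the candidates are simply enumerated and compared. In practice the parameter space is far too large for that, which is why a systematic way of improving the parameters is needed.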
Before discussing the ...