January 2020
Intermediate to advanced
432 pages
10h 18m
English
Actor-critic (AC) methods use a combination of networks to predict the outputs of the value and policy functions: the value function network resembles DQN, and the policy function is defined using a policy gradient (PG) method such as REINFORCE. For the most part, this is as simple as it sounds; however, several details in how we code these implementations require attention. We will therefore cover those details as we review the code. Open Chapter_8_ActorCritic.py and follow the next exercise: