September 2018
Intermediate to advanced
296 pages
9h 10m
English
In the previous chapters, we discussed the DQN for playing Atari games and the use of the DPG and TRPO algorithms for continuous control tasks. Recall that DQN has the following architecture:

At each timestep, the agent observes the frame image and selects an action based on the current learned policy. ...
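As a reminder of what action selection can look like in practice, here is a minimal sketch of an epsilon-greedy rule, which is the exploration scheme commonly paired with DQN. It is an illustration under stated assumptions, not the exact code used in this book: the names `select_action`, `q_values`, and `epsilon` are hypothetical, and the Q-values are hard-coded stand-ins for the output that a Q-network would produce for the current frame.

```python
import numpy as np


def select_action(q_values, epsilon, rng):
    """Pick an action index with an epsilon-greedy rule.

    q_values: 1-D array of estimated action values (assumed to come
              from a Q-network evaluated on the current observation).
    epsilon:  probability of choosing a uniformly random action.
    rng:      a numpy random Generator.
    """
    if rng.random() < epsilon:
        # Explore: sample any action uniformly at random.
        return int(rng.integers(len(q_values)))
    # Exploit: take the action with the highest estimated value.
    return int(np.argmax(q_values))


rng = np.random.default_rng(0)

# Made-up action values for a three-action task (illustrative only).
q_values = np.array([0.1, 0.7, 0.2])

# With epsilon = 0 the choice is purely greedy.
greedy_action = select_action(q_values, epsilon=0.0, rng=rng)

# With epsilon = 1 every choice is random but still a valid action index.
random_actions = [select_action(q_values, epsilon=1.0, rng=rng) for _ in range(20)]
```

In training, `epsilon` is typically annealed from a value close to 1 toward a small constant, so that the agent explores widely at first and relies more on its learned value estimates later.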