October 2018
Beginner
362 pages
9h 32m
English
Deep Deterministic Policy Gradients (DDPG) can be extended naturally by replacing the feedforward neural networks that approximate the actor and the critic with recurrent neural networks. This extension is called the recurrent deterministic policy gradient (RDPG) algorithm and is discussed in the following paper: N. Heess, J. J. Hunt, T. P. Lillicrap, and D. Silver. Memory-based control with recurrent neural networks. 2015.
The recurrent actor and critic are trained using backpropagation through time (BPTT). Interested readers can download the paper from https://arxiv.org/abs/1512.04455.
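To make the idea of BPTT more concrete, the following minimal sketch unrolls a tiny scalar recurrent "actor" over a short observation sequence and propagates gradients backward through every time step. It is not the RDPG algorithm from the paper: the scalar tanh cell, the parameter names (`w_h`, `w_x`, `w_a`), the observation and target values, and the squared-error loss that stands in for the critic's signal are all assumptions chosen for illustration. The result is checked against a finite-difference estimate.

```python
import math

def forward(params, observations):
    """Unroll a scalar tanh RNN; return all hidden states and actions."""
    w_h, w_x, w_a = params
    hidden = [0.0]  # h_0 = 0 (assumed initial state)
    actions = []
    for o in observations:
        h = math.tanh(w_h * hidden[-1] + w_x * o)  # state summarizes history
        hidden.append(h)
        actions.append(w_a * h)  # deterministic action from hidden state
    return hidden, actions

def loss(params, observations, targets):
    """Toy squared-error loss (a stand-in for the critic's signal)."""
    _, actions = forward(params, observations)
    return sum(0.5 * (a - y) ** 2 for a, y in zip(actions, targets))

def bptt_grads(params, observations, targets):
    """Gradients of the loss w.r.t. (w_h, w_x, w_a) via BPTT."""
    w_h, w_x, w_a = params
    hidden, actions = forward(params, observations)
    g_wh = g_wx = g_wa = 0.0
    dh_next = 0.0  # gradient reaching h_t from later time steps
    for t in reversed(range(len(observations))):
        h_t, h_prev = hidden[t + 1], hidden[t]
        da = actions[t] - targets[t]   # dL/da_t
        g_wa += da * h_t
        dh = da * w_a + dh_next        # local term plus future term
        dz = dh * (1.0 - h_t ** 2)     # back through tanh
        g_wh += dz * h_prev
        g_wx += dz * observations[t]
        dh_next = dz * w_h             # pass gradient to h_{t-1}
    return [g_wh, g_wx, g_wa]

def numeric_grads(params, observations, targets, eps=1e-6):
    """Central finite differences, used only to check the BPTT result."""
    grads = []
    for i in range(len(params)):
        up, down = list(params), list(params)
        up[i] += eps
        down[i] -= eps
        diff = loss(up, observations, targets) - loss(down, observations, targets)
        grads.append(diff / (2 * eps))
    return grads

# Hypothetical values, chosen only for the demonstration.
params = [0.5, -0.3, 0.8]
observations = [1.0, 0.5, -1.0, 0.25]
targets = [0.1, -0.2, 0.3, 0.0]

analytic = bptt_grads(params, observations, targets)
numeric = numeric_grads(params, observations, targets)
```

The two gradient lists should agree to within numerical error. In practice, a deep learning framework's automatic differentiation performs this backward pass, and the recurrent cell is typically an LSTM or GRU layer rather than a single scalar unit.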