September 2018
Intermediate to advanced
288 pages
7h 38m
English
Deep Deterministic Policy Gradients (DDPG) is a policy-gradient algorithm that adopts a stochastic behavior policy for exploration. This algorithm estimates a deterministic target policy, which is much easier to learn. Policy-gradient algorithms use the following policy iteration:
DDPG is out of policy and uses a deterministic target policy, and for these reasons the use of the Deterministic Policy Gradient theorem is allowed. DDPG is also a critical algorithm-actor; it mainly uses two neural networks: one for the actor and one for the critic. These networks calculate the action forecasts for the current state and generate a ...
Read now
Unlock full access