October 2019
From an implementation perspective, the only change needed to implement DDQN is in the training phase. You just need to replace the following lines of code in the DQN implementation:
mb_trg_qv = sess.run(target_qv, feed_dict={obs_ph:mb_obs2})
y_r = q_target_values(mb_rew, mb_done, mb_trg_qv, discount)
with these lines, which also evaluate the online network on the next observations:
mb_onl_qv, mb_trg_qv = sess.run([online_qv,target_qv], feed_dict={obs_ph:mb_obs2})
y_r = double_q_target_values(mb_rew, mb_done, mb_trg_qv, mb_onl_qv, discount)
Here, double_q_target_values is a function that computes (5.7) for each transition of the minibatch.
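The body of double_q_target_values is not shown in this excerpt, so the following is only a minimal NumPy sketch of what it might look like. It assumes that (5.7) is the standard Double DQN target, in which the online network selects the best next action and the target network evaluates it, and that both Q-value arrays have the shape (batch_size, num_actions). The parameter names are illustrative; only the argument order is taken from the call above.

```python
import numpy as np

def double_q_target_values(mini_batch_rw, mini_batch_done, target_qv, online_qv, discount):
    """Sketch of the Double DQN target for each transition in a minibatch.

    Assumed shapes: rewards and done flags are (batch_size,),
    target_qv and online_qv are (batch_size, num_actions).
    """
    rew = np.asarray(mini_batch_rw, dtype=np.float32)
    done = np.asarray(mini_batch_done, dtype=np.float32)
    target_qv = np.asarray(target_qv, dtype=np.float32)
    online_qv = np.asarray(online_qv, dtype=np.float32)

    # Action selection: greedy action according to the online network
    best_actions = np.argmax(online_qv, axis=1)

    # Action evaluation: target network's value for the selected action
    next_q = target_qv[np.arange(len(best_actions)), best_actions]

    # Terminal transitions (done == 1) keep only the immediate reward
    return rew + (1.0 - done) * discount * next_q
```

As a quick check under these assumptions: with rewards [1.0, 2.0], done flags [False, True], target Q-values [[1.0, 5.0], [3.0, 4.0]], online Q-values [[10.0, 0.0], [0.0, 10.0]], and a discount of 0.5, the sketch yields targets 1.5 and 2.0. A plain DQN target would instead take the maximum target Q-value (5.0) for the first transition and give 3.5, which illustrates how decoupling selection from evaluation changes the estimate.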