October 2019
Intermediate to advanced
366 pages
12h 4m
English
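The deep Q-network is trained by minimizing the loss function (5.2) that we have already presented, but with the further use of a separate Q-target network, $\hat{Q}$, with weights $\theta'$. Putting everything together, the loss function becomes:

$$L(\theta) = \mathbb{E}_{(s,a,r,s')}\left[\left(r + \gamma \max_{a'} \hat{Q}_{\theta'}(s',a') - Q_{\theta}(s,a)\right)^2\right]$$

Here, $\theta$ denotes the parameters of the online network.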
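To make the role of the two networks concrete, the following is a minimal sketch of this loss in plain Python, not the book's implementation. It assumes the standard squared temporal-difference error, with the bootstrapped target taken from the target network and the prediction taken from the online network. The function name `dqn_loss`, the lookup-table "networks", and the `done` flag for terminal transitions are illustrative assumptions that do not appear in the text.

```python
def dqn_loss(q_online, q_target, batch, gamma=0.99):
    """Mean squared TD error over a batch of transitions.

    q_online(s) and q_target(s) each return a list of action values.
    batch is a list of (s, a, r, s_next, done) tuples (assumed layout).
    """
    total = 0.0
    for s, a, r, s_next, done in batch:
        # The target uses the separate target network; by assumption,
        # terminal transitions do not bootstrap.
        bootstrap = 0.0 if done else gamma * max(q_target(s_next))
        y = r + bootstrap
        # The prediction uses the online network, whose parameters are trained.
        td_error = y - q_online(s)[a]
        total += td_error ** 2
    return total / len(batch)


# Hypothetical toy "networks": lookup tables mapping state -> action values.
online_table = {0: [1.0, 2.0], 1: [0.5, 0.0]}
target_table = {0: [1.0, 1.5], 1: [3.0, 4.0]}

batch = [
    (0, 1, 1.0, 1, False),  # y = 1.0 + 0.5 * 4.0 = 3.0, prediction 2.0
    (1, 0, 2.0, 0, True),   # terminal: y = 2.0, prediction 0.5
]

loss = dqn_loss(online_table.__getitem__, target_table.__getitem__, batch, gamma=0.5)
print(loss)  # 1.625
```

In a real implementation the target would be treated as a constant during backpropagation, so gradients flow only through the online network's prediction, and the target network's weights would be refreshed from the online network periodically.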
The optimization of the differentiable loss ...