January 2018
Beginner to intermediate
284 pages
8h 35m
English
The core idea behind experience replay is to store past experiences in memory and sample from them during the learning phase. More specifically, at every time step $t$, DQN stores an experience $e_t$ of the form $e_t = (s_t, a_t, r_t, s_{t+1})$ in a replay buffer $D_t = \{e_1, \ldots, e_t\}$.
To reduce the correlation between consecutive samples, it draws a mini-batch of experiences from the replay buffer using a uniform distribution as follows: $(s, a, r, s') \sim U(D)$.
This allows the network to avoid overfitting to correlated data. From an implementation perspective, the mini-batch update can be massively parallelized, which shortens training time. ...
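
To make the store-then-sample idea concrete, here is a minimal Python sketch of a replay buffer. It is an illustration under stated assumptions, not the book's implementation: the names `Experience`, `ReplayBuffer`, `store`, and `sample` are hypothetical, and the fixed capacity that discards the oldest experience is a common convention rather than something the text above specifies.

```python
import random
from collections import deque, namedtuple

# Hypothetical container for one experience (state, action, reward, next state).
Experience = namedtuple("Experience", ["state", "action", "reward", "next_state"])


class ReplayBuffer:
    """Fixed-size memory of past experiences with uniform random sampling."""

    def __init__(self, capacity, seed=None):
        # Assumption: once the buffer is full, the oldest experience is dropped.
        self.memory = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def store(self, state, action, reward, next_state):
        """Append the experience observed at the current time step."""
        self.memory.append(Experience(state, action, reward, next_state))

    def sample(self, batch_size):
        """Draw batch_size experiences uniformly at random, without replacement.

        Returns one Experience whose fields are tuples of length batch_size,
        so each field can be turned into a single array for a batched update.
        """
        batch = self.rng.sample(list(self.memory), batch_size)
        return Experience(*zip(*batch))

    def __len__(self):
        return len(self.memory)
```

In a training loop, one would typically call `store` after every environment step and, once `len(buffer)` is at least the batch size, call `sample` to obtain the mini-batch for the next update. Because the samples come from many different time steps, consecutive transitions rarely end up in the same batch.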