June 2018
Intermediate to advanced
546 pages
In the previous section, we solved Pong in three hours of optimization and 9M frames. Now is a good time to tweak our hyperparameters to speed up convergence. The golden rule here is to tweak one option at a time and to draw conclusions carefully, as the whole process is stochastic.
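To make the "one option at a time" rule concrete, here is a minimal sketch of how such an experiment plan could be generated. The option names, the baseline values, the candidate values, and the helper functions `one_at_a_time` and `plan_runs` are all hypothetical placeholders for illustration, not the settings used in this chapter. Because training is stochastic, the sketch also repeats every configuration over several seeds, so that a difference between two runs is less likely to be mistaken for an effect of the changed option.

```python
# Hypothetical baseline: names and values are illustrative placeholders,
# not the hyperparameters used in the chapter.
BASELINE = {
    "learning_rate": 1e-3,
    "entropy_beta": 1e-2,
    "batch_size": 128,
    "num_envs": 50,
}

# Hypothetical alternative values to try for each option.
CANDIDATES = {
    "learning_rate": [3e-3, 1e-2],
    "entropy_beta": [3e-2],
    "batch_size": [32, 64],
    "num_envs": [16],
}


def one_at_a_time(baseline, candidates):
    """Yield (name, config) pairs that differ from the baseline in one option."""
    yield "baseline", dict(baseline)
    for key, values in candidates.items():
        for value in values:
            if value == baseline[key]:
                continue  # identical to the baseline, nothing to compare
            config = dict(baseline)
            config[key] = value
            yield f"{key}={value}", config


def plan_runs(baseline, candidates, seeds):
    """Repeat every configuration once per seed to average out randomness."""
    return [
        (name, seed, config)
        for name, config in one_at_a_time(baseline, candidates)
        for seed in seeds
    ]


if __name__ == "__main__":
    for name, seed, config in plan_runs(BASELINE, CANDIDATES, seeds=[0, 1, 2]):
        print(name, seed, config)
```

Each entry in the plan would then be passed to the training script, and runs that share a name would be compared as a group against the baseline group rather than run by run.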
In this section, we'll start with the original hyperparameters and perform the following experiments:
Strictly speaking, the experiments below are not proper hyperparameter tuning, just an attempt to better understand how A2C convergence dynamics depend on the parameters. To ...