Experiment results

In this section, we'll take a look at the results of our multi-step training process.

The baseline agent

To train the agent, run Chapter17/01_a2c.py with the optional --cuda flag to enable the GPU and the required -n option, which gives the experiment a name that is used in TensorBoard and in the name of the directory where models are saved.

Chapter17$ ./01_a2c.py --cuda -n tt
AtariA2C (
  (conv): Sequential (
    (0): Conv2d(2, 32, kernel_size=(8, 8), stride=(4, 4))
    (1): ReLU ()
    (2): Conv2d(32, 64, kernel_size=(4, 4), stride=(2, 2))
    (3): ReLU ()
    (4): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1))
    (5): ReLU ()
  )
  (fc): Sequential (
    (0): Linear (3136 -> 512)
    (1): ReLU ()
  )
  (policy): Linear (512 -> 4)
  (value): Linear (512 -> 1)
)
4: done 13 episodes, mean_reward=0.00, best_reward=0.00, ...
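
For reference, the printed architecture can be reproduced with a short PyTorch module. This is a minimal sketch: the layer shapes follow the console output above, and the class name AtariA2C matches the printout, but the input shape of (2, 84, 84) (two stacked 84x84 frames) and the _conv_out_size helper are assumptions made for illustration; the actual 01_a2c.py may be organized differently.

import torch
import torch.nn as nn

class AtariA2C(nn.Module):
    # Actor-critic network: a shared conv + fc body with two output heads.
    def __init__(self, input_shape=(2, 84, 84), n_actions=4):
        super().__init__()
        # Convolutional feature extractor, as in the printout:
        # 2 input frames -> 32 -> 64 -> 64 feature maps.
        self.conv = nn.Sequential(
            nn.Conv2d(input_shape[0], 32, kernel_size=8, stride=4),
            nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2),
            nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=3, stride=1),
            nn.ReLU(),
        )
        conv_out = self._conv_out_size(input_shape)  # 3136 for a 2x84x84 input
        self.fc = nn.Sequential(
            nn.Linear(conv_out, 512),
            nn.ReLU(),
        )
        self.policy = nn.Linear(512, n_actions)  # actor head: action logits
        self.value = nn.Linear(512, 1)           # critic head: state value V(s)

    def _conv_out_size(self, shape):
        # Pass a dummy tensor through conv to get the flattened feature size.
        with torch.no_grad():
            return self.conv(torch.zeros(1, *shape)).numel()

    def forward(self, x):
        feats = self.fc(self.conv(x).flatten(start_dim=1))
        return self.policy(feats), self.value(feats)

During training, the policy head's logits feed a softmax to sample actions, while the value head provides the baseline used in the advantage estimate.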
