Deep Reinforcement Learning Hands-On by Maxim Lapan

Experiment results

In this section, we'll take a look at the results of our multi-step training process.

The baseline agent

To train the agent, run Chapter17/01_a2c.py with the optional --cuda flag to enable the GPU and the required -n option, which sets the experiment name used in TensorBoard and in the name of the directory where models are saved.

Chapter17$ ./01_a2c.py --cuda -n tt
AtariA2C (
  (conv): Sequential (
    (0): Conv2d(2, 32, kernel_size=(8, 8), stride=(4, 4))
    (1): ReLU ()
    (2): Conv2d(32, 64, kernel_size=(4, 4), stride=(2, 2))
    (3): ReLU ()
    (4): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1))
    (5): ReLU ()
  )
  (fc): Sequential (
    (0): Linear (3136 -> 512)
    (1): ReLU ()
  )
  (policy): Linear (512 -> 4)
  (value): Linear (512 -> 1)
)
4: done 13 episodes, mean_reward=0.00, best_reward=0.00, ...
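
The printout above fully determines the network's structure, so it can be reconstructed for reference. The following is a minimal PyTorch sketch, assuming two stacked 84x84 input frames (which gives the 3136 convolutional features seen in the printout) and four actions; the class and helper names are hypothetical, and the actual definition lives in Chapter17/01_a2c.py.

import torch
import torch.nn as nn


class AtariA2C(nn.Module):
    # Hypothetical reconstruction of the module shown in the printout
    def __init__(self, input_shape, n_actions):
        super(AtariA2C, self).__init__()
        # Convolutional feature extractor over the stacked frames
        self.conv = nn.Sequential(
            nn.Conv2d(input_shape[0], 32, kernel_size=8, stride=4),
            nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2),
            nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=3, stride=1),
            nn.ReLU(),
        )
        conv_out_size = self._get_conv_out(input_shape)
        # Shared fully connected body: 3136 -> 512 for (2, 84, 84) input
        self.fc = nn.Sequential(
            nn.Linear(conv_out_size, 512),
            nn.ReLU(),
        )
        # Two heads: policy logits over actions and a scalar state value
        self.policy = nn.Linear(512, n_actions)
        self.value = nn.Linear(512, 1)

    def _get_conv_out(self, shape):
        # Pass a dummy batch through conv to find the flattened size
        o = self.conv(torch.zeros(1, *shape))
        return int(torch.numel(o) / o.size(0))

    def forward(self, x):
        conv_out = self.conv(x).view(x.size(0), -1)
        fc_out = self.fc(conv_out)
        return self.policy(fc_out), self.value(fc_out)


net = AtariA2C((2, 84, 84), 4)
print(net)

Printing the instantiated module reproduces a layer layout matching the console output above, which is a quick way to verify the reconstruction.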