April 2018
We discussed the Pong environment in Chapter 4, Policy Gradients. The following code sets up the imports, hyperparameters, and environment for an A3C agent on Pong-v0 in OpenAI Gym:
import multiprocessing
import threading
import tensorflow as tf
import numpy as np
import gym
import os
import shutil
import matplotlib.pyplot as plt

game_env = 'Pong-v0'
num_workers = multiprocessing.cpu_count()
max_global_episodes = 100000
global_network_scope = 'globalnet'
global_iteration_update = 20
gamma = 0.9
beta = 0.0001
lr_actor = 0.0001   # learning rate for actor
lr_critic = 0.0001  # learning rate for critic
global_running_rate = []
global_episode = 0

env = gym.make(game_env)
num_actions = env.action_space.n
tf.reset_default_graph()
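To make the role of gamma more concrete, here is a minimal sketch of how a discount factor is typically applied to a worker's short rollout in actor-critic methods. It is an illustration under stated assumptions, not the book's implementation: the helper name discounted_returns, the bootstrap_value argument, and the sample rewards are all hypothetical.

```python
import numpy as np

gamma = 0.9  # discount factor, same value as in the hyperparameters above


def discounted_returns(rewards, bootstrap_value, gamma):
    """Compute the discounted return for every step of a rollout.

    Hypothetical helper: bootstrap_value stands in for the critic's value
    estimate of the state that follows the last reward (0.0 if terminal).
    """
    returns = np.zeros(len(rewards), dtype=np.float32)
    running = bootstrap_value
    # Walk backwards so each step reuses the return of the step after it.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


# Made-up rollout: a single point scored on the last of three steps.
rewards = [0.0, 0.0, 1.0]
targets = discounted_returns(rewards, bootstrap_value=0.0, gamma=gamma)
```

In a setup like this, each worker would compute such targets over at most global_iteration_update steps before pushing gradients to the global network, and the critic would be trained to regress toward them.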
The input state image preprocessing ...