January 2020
Intermediate to advanced
432 pages
10h 18m
English
The agent code now plays, or explores, the environment, so it helps to understand how this code runs. Open Chapter_3_3.py again and follow the exercise:
def play_game(env, policy, display=True):
    env.reset()
    episode = []
    finished = False
    while not finished:
        # Current state, read from the inner environment object
        s = env.env.s
        if display:
            clear_output(True)
            env.render()
            sleep(1)
        timestep = []
        timestep.append(s)
        # Draw a random point in [0, total weight of the actions in state s]
        n = random.uniform(0, sum(policy[s].values()))
        top_range = 0
        action = 0
        # Walk the cumulative weights; the first action whose range contains n wins
        for prob in policy[s].items():
            top_range += prob[1]
            if n < top_range:
                action = prob[0]
                break
        state, reward, finished, info = env.step(action)
        ...
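The action-selection step in the loop above can be tried on its own, without an environment. The following is a minimal sketch, not code from Chapter_3_3.py: the helper name `sample_action`, its `rng` parameter, and the example weights are hypothetical and introduced only for illustration. It assumes, as the excerpt does, that `policy[s]` is a dictionary mapping each action to a non-negative weight.

```python
import random
from collections import Counter


def sample_action(action_weights, rng=random):
    """Return a key of action_weights, chosen in proportion to its weight.

    Hypothetical helper: mirrors the cumulative-range loop in play_game.
    """
    # Random point between 0 and the total weight of all actions.
    n = rng.uniform(0, sum(action_weights.values()))
    top_range = 0.0
    # Fallback if no range matches (the excerpt uses action 0 here;
    # this sketch uses the first key so non-integer actions also work).
    action = next(iter(action_weights))
    for candidate, weight in action_weights.items():
        top_range += weight
        if n < top_range:
            action = candidate
            break
    return action


# Example: a made-up policy entry that favours action 2.
rng = random.Random(0)  # seeded only so repeated runs behave the same
weights = {0: 0.1, 1: 0.1, 2: 0.7, 3: 0.1}
tally = Counter(sample_action(weights, rng) for _ in range(1000))
print(tally.most_common())
```

Because each action owns a slice of the range proportional to its weight, actions with larger weights are drawn more often, and an action with weight 0 is never selected by the comparison `n < top_range`.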