October 2019
We now develop double Q-learning to solve the Taxi environment as follows:
>>> import torch
>>> import gym
>>> env = gym.make('Taxi-v2')
>>> n_episode = 3000
>>> length_episode = [0] * n_episode
>>> total_reward_episode = [0] * n_episode
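Here, we simulate 3,000 episodes because double Q-learning takes more episodes to converge than regular Q-learning. The two lists record the number of steps and the total reward for each episode.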
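To make the slower convergence concrete, here is a minimal sketch of a single double Q-learning update. It is an illustration under stated assumptions, not necessarily the implementation developed in this recipe: the function name double_q_update, the nested-list Q-tables, and the default alpha and gamma values are all hypothetical, and plain Python lists stand in for tensors so that the sketch runs without torch or gym. Each step updates only one of the two tables, chosen at random, which is one reason more episodes are needed.

```python
import random


def double_q_update(Q1, Q2, state, action, reward, next_state, done,
                    alpha=0.4, gamma=1.0, rng=random):
    """Apply one double Q-learning update in place (illustrative sketch).

    Q1 and Q2 are tables indexed as Q[state][action]. The alpha and gamma
    defaults are assumptions chosen for illustration only.
    """
    # Choose which table to update with equal probability.
    if rng.random() < 0.5:
        update, evaluate = Q1, Q2
    else:
        update, evaluate = Q2, Q1

    if done:
        # No bootstrapping from a terminal state.
        target = reward
    else:
        # Select the greedy next action using the table being updated...
        row = update[next_state]
        best_action = max(range(len(row)), key=row.__getitem__)
        # ...but estimate its value using the other table.
        target = reward + gamma * evaluate[next_state][best_action]

    # Move the chosen table's estimate toward the target.
    update[state][action] += alpha * (target - update[state][action])


# Example setup: two zero-initialized tables for a toy problem
# with 3 states and 2 actions (sizes are hypothetical).
n_state, n_action = 3, 2
Q1 = [[0.0] * n_action for _ in range(n_state)]
Q2 = [[0.0] * n_action for _ in range(n_state)]
double_q_update(Q1, Q2, state=0, action=1, reward=-1.0,
                next_state=1, done=False)
```

For the Taxi environment, the tables would have 500 rows (states) and 6 columns (actions). Separating action selection from action evaluation is what reduces the overestimation of action values, at the cost of each table receiving only about half of the updates.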