April 2017
Intermediate to advanced
320 pages
7h 46m
English
The following is the implementation of the Q-learning algorithm for the FrozenLake-v0 problem:
import gym
import numpy as np

env = gym.make('FrozenLake-v0')

# Initialize table with all zeros
Q = np.zeros([env.observation_space.n, env.action_space.n])

# Set learning parameters
lr = .85
gamma = .99
num_episodes = 2000

# Create lists to contain total rewards and steps per episode
rList = []
for i in range(num_episodes):
    # Reset environment and get first new observation
    s = env.reset()
    rAll = 0
    d = False
    j = 0
    # The Q-Table learning algorithm
    while j < 99:
        j += 1
        # Choose an action by greedily (with noise) picking from Q table
        a = np.argmax(Q[s, :] + np.random.randn(1, env.action_space.n) * (1. / (i + 1)))
        ...
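The listing above breaks off at the action-selection step, so the rest of the loop is not shown here. As a hedged illustration of what a tabular Q-learning loop typically does next, the sketch below applies the standard update Q(s, a) <- Q(s, a) + lr * (r + gamma * max Q(s', .) - Q(s, a)). It is not the book's code: to stay runnable without gym or numpy it uses only the standard library, and the corridor environment (`step`) and the helper names `q_update` and `train` are hypothetical stand-ins chosen for this example. Only `lr`, `gamma`, the 99-step cap, and the decaying Gaussian noise mirror the listing.

```python
import random


def q_update(Q, s, a, r, s_next, lr=0.85, gamma=0.99):
    """Apply one tabular Q-learning update in place and return the new value."""
    td_target = r + gamma * max(Q[s_next])   # bootstrapped one-step return
    Q[s][a] += lr * (td_target - Q[s][a])    # move the estimate toward the target
    return Q[s][a]


def step(s, a, n_states):
    """Hypothetical 1-D corridor (an assumption, not FrozenLake):
    action 1 moves right, action 0 moves left.
    Reaching the last state yields reward 1 and ends the episode."""
    s_next = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
    done = s_next == n_states - 1
    return s_next, (1.0 if done else 0.0), done


def train(n_states=5, n_actions=2, num_episodes=200, max_steps=99, seed=0):
    """Run Q-learning on the corridor and return the learned table."""
    rng = random.Random(seed)
    Q = [[0.0] * n_actions for _ in range(n_states)]
    for i in range(num_episodes):
        s = 0
        for _ in range(max_steps):
            # Greedy choice with Gaussian noise that decays as 1/(i+1),
            # mirroring the noisy argmax in the listing above
            noisy = [q + rng.gauss(0, 1) / (i + 1) for q in Q[s]]
            a = noisy.index(max(noisy))
            s_next, r, done = step(s, a, n_states)
            q_update(Q, s, a, r, s_next)
            s = s_next
            if done:
                break
    return Q


if __name__ == "__main__":
    for state, values in enumerate(train()):
        print(state, [round(v, 3) for v in values])
```

In this toy setting the learned table should favor moving right in every non-terminal state, since that is the only way to reach the reward. Swapping `step` for `env.step` and the nested lists for a numpy array would bring the sketch closer to the FrozenLake setup, under the same caveats.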