October 2019
Intermediate to advanced
366 pages
12h 4m
English
To consolidate the ideas behind policy iteration, we'll apply it to a game called FrozenLake. The environment is a 4 x 4 grid. Using four actions that correspond to the directions (0 is left, 1 is down, 2 is right, and 3 is up), the agent has to move to the opposite side of the grid without falling into the holes. Movement is also uncertain: the agent may end up moving in a direction other than the one it chose. As a result, the best action in a given cell is not always the one that points in the direction the agent wants to go. A reward of +1 is assigned when the end goal is reached. The map of the game is shown in figure 3.4, where S is the start position, the star is the end position, and the spirals are the holes.
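To make the setup concrete, here is a minimal sketch of the environment as a transition model, with policy iteration run on top of it. It uses only the standard library and rests on several assumptions that are not stated in the text: the grid layout is the default 4 x 4 map from OpenAI Gym's FrozenLake, a chosen action is replaced by either perpendicular direction with probability 1/3 each (Gym's slippery behavior), bumping into a wall leaves the agent in place, and the discount factor is 0.99. The names `step`, `transitions`, `q_value`, and `policy_iteration` are hypothetical helpers written for this illustration.

```python
# Assumed layout: the default 4x4 map from OpenAI Gym's FrozenLake.
# S = start, F = frozen (safe), H = hole, G = goal.
GRID = ["SFFF",
        "FHFH",
        "FFFH",
        "HFFG"]
N = 4
# Action encoding from the text: 0 left, 1 down, 2 right, 3 up.
MOVES = {0: (0, -1), 1: (1, 0), 2: (0, 1), 3: (-1, 0)}


def step(state, action):
    """Deterministic move; bumping into a wall leaves the agent in place."""
    row, col = divmod(state, N)
    d_row, d_col = MOVES[action]
    row = min(max(row + d_row, 0), N - 1)
    col = min(max(col + d_col, 0), N - 1)
    return row * N + col


def transitions(state, action):
    """Return a list of (probability, next_state, reward) triples."""
    if GRID[state // N][state % N] in "HG":
        return [(1.0, state, 0.0)]  # holes and the goal are absorbing
    outcomes = []
    # Assumption: the agent moves in the chosen direction or in either
    # perpendicular direction, each with probability 1/3.
    for actual in ((action - 1) % 4, action, (action + 1) % 4):
        nxt = step(state, actual)
        reward = 1.0 if GRID[nxt // N][nxt % N] == "G" else 0.0
        outcomes.append((1 / 3, nxt, reward))
    return outcomes


def q_value(V, state, action, gamma):
    """Expected return of taking `action` in `state`, then following V."""
    return sum(p * (r + gamma * V[s2]) for p, s2, r in transitions(state, action))


def policy_iteration(gamma=0.99, tol=1e-10):
    V = [0.0] * (N * N)
    policy = [0] * (N * N)  # start from "always go left"
    while True:
        # Policy evaluation: sweep until the values stop changing.
        while True:
            delta = 0.0
            for s in range(N * N):
                v = q_value(V, s, policy[s], gamma)
                delta = max(delta, abs(v - V[s]))
                V[s] = v
            if delta < tol:
                break
        # Policy improvement: switch only on a clear gain to avoid cycling on ties.
        stable = True
        for s in range(N * N):
            best = max(range(4), key=lambda a: q_value(V, s, a, gamma))
            if q_value(V, s, best, gamma) > q_value(V, s, policy[s], gamma) + 1e-6:
                policy[s] = best
                stable = False
        if stable:
            return V, policy


if __name__ == "__main__":
    V, policy = policy_iteration()
    for row in range(N):
        print(policy[row * N:(row + 1) * N])
```

Running the script prints the chosen action for each cell, row by row, using the same 0 to 3 encoding as above. Entries for holes and the goal carry no meaning, because those cells are absorbing in this sketch.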