Cliff walking using RL
By now, you should be familiar with the RL framework. In this recipe, we will implement an application of the gridworld environment in RL. The problem can be represented as a 4x12 grid. Each episode starts in the lower-left state, S, and the goal state, G, is at the bottom right of the grid. The only possible actions in any state are moving left, right, up, and down. The states labeled C in the lower part of the grid are cliffs. Any transition into a cliff state incurs a large negative reward of -100 and sends the agent instantly back to the starting state, S. A transition into the goal state, G, has a reward of 0, and every other transition has a reward of -1.
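The transition and reward rules described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the recipe's own implementation: the names `step`, `START`, `GOAL`, `CLIFF`, and `ACTIONS` are hypothetical, the cliff cells are assumed to fill the bottom row between S and G, and moves that would leave the grid are assumed to keep the agent in place.

```python
# Minimal sketch of the cliff-walking grid; all names are illustrative.
N_ROWS, N_COLS = 4, 12
START = (3, 0)    # lower-left corner (row, col), row 0 is the top row
GOAL = (3, 11)    # lower-right corner

# Assumption: the cliff occupies the bottom-row cells between S and G.
CLIFF = {(3, col) for col in range(1, 11)}

# Each action maps to a (row, col) offset.
ACTIONS = {
    "up": (-1, 0),
    "down": (1, 0),
    "left": (0, -1),
    "right": (0, 1),
}


def step(state, action):
    """Return (next_state, reward, done) for one move from `state`."""
    d_row, d_col = ACTIONS[action]
    # Assumption: a move off the grid leaves the agent where it is.
    row = min(max(state[0] + d_row, 0), N_ROWS - 1)
    col = min(max(state[1] + d_col, 0), N_COLS - 1)
    next_state = (row, col)

    if next_state in CLIFF:
        # Falling off the cliff: large penalty and reset to the start.
        return START, -100, False
    if next_state == GOAL:
        # Reaching the goal ends the episode with reward 0.
        return GOAL, 0, True
    # Every other transition costs -1.
    return next_state, -1, False
```

For example, under these assumptions `step(START, "right")` moves into a cliff cell, so it returns the start state with a reward of -100, while `step(START, "up")` moves to `(2, 0)` with a reward of -1.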
The following image shows the navigation ...