January 2020
Intermediate to advanced
432 pages
10h 18m
English
Dynamic programming (DP) was the second major thread to influence modern reinforcement learning (RL), after trial-and-error learning. In this chapter, we will cover the foundations of DP and explore how they shaped the field of RL. We will also examine how the Bellman equation and the concept of optimality are interwoven with RL. From there, we will turn to policy iteration and value iteration, two methods for solving a class of problems well suited to DP. Finally, we will apply the concepts from this chapter to teach an agent to play the FrozenLake environment from OpenAI Gym.
Here are the main topics we will cover in this chapter: