January 2020
Intermediate to advanced
432 pages
10h 18m
English
Bellman worked on solving finite Markov decision processes (MDPs) with dynamic programming (DP), and it was during these efforts that he derived his famed equation. The beauty of this equation, and of the concept behind it, is that it describes a method for optimizing the value, or quality, of a state. In other words, it describes how we can determine the optimal value of being in a given state from the actions available there and the values of the successor states those actions lead to. Before breaking down the equation itself, let's first reconsider the finite MDP in the next section.
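As a rough preview of that idea, the following minimal sketch repeatedly backs up state values on a tiny MDP. The two states, their actions, the rewards, and the discount factor are all hypothetical, invented purely for illustration, and the names `TRANSITIONS`, `GAMMA`, `bellman_backup`, and `value_iteration` are assumptions rather than anything defined in this book.

```python
# Hypothetical two-state MDP, invented for illustration only.
# TRANSITIONS[state][action] -> list of (probability, next_state, reward)
TRANSITIONS = {
    "A": {"stay": [(1.0, "A", 0.0)], "go": [(1.0, "B", 1.0)]},
    "B": {"stay": [(1.0, "B", 2.0)], "go": [(1.0, "A", 0.0)]},
}
GAMMA = 0.5  # discount factor (assumed value)


def bellman_backup(values, state):
    """Return the best expected return over the actions available in `state`.

    Each action is scored as its immediate reward plus the discounted
    value of the successor state, weighted by transition probability.
    """
    return max(
        sum(p * (reward + GAMMA * values[nxt]) for p, nxt, reward in outcomes)
        for outcomes in TRANSITIONS[state].values()
    )


def value_iteration(sweeps=50):
    """Apply the backup to every state repeatedly, starting from zero values."""
    values = {state: 0.0 for state in TRANSITIONS}
    for _ in range(sweeps):
        values = {state: bellman_backup(values, state) for state in TRANSITIONS}
    return values


if __name__ == "__main__":
    print(value_iteration())
```

In this toy setup the values settle near 3 for state `A` and 4 for state `B`: staying in `B` is worth 2 + 0.5 × 4 = 4, and moving from `A` to `B` is worth 1 + 0.5 × 4 = 3, which beats staying in `A`.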