The Bellman equation is one of the most important equations in reinforcement learning and the cornerstone of solving reinforcement learning problems. Developed by applied mathematician Richard Bellman, it's less of an equation and more of a condition for optimality: it expresses the value of an agent's decision at a point in time in terms of the immediate reward and the expected choices and rewards that could follow from that decision. The Bellman equation can be derived for either the state value function or the action value function:
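In one common textbook notation, the two forms look like the sketch below. The symbols are assumptions, since none of them are defined in the text so far: $\pi(a \mid s)$ is the policy, $\gamma \in [0, 1]$ is the discount factor, and $p(s', r \mid s, a)$ is the probability of landing in state $s'$ with reward $r$ after taking action $a$ in state $s$. Other sources may write the same relationships a little differently.

$$
\begin{aligned}
v_\pi(s) &= \mathbb{E}_\pi\left[ R_{t+1} + \gamma\, v_\pi(S_{t+1}) \mid S_t = s \right] \\
         &= \sum_{a} \pi(a \mid s) \sum_{s',\, r} p(s', r \mid s, a) \left[ r + \gamma\, v_\pi(s') \right] \\[6pt]
q_\pi(s, a) &= \mathbb{E}_\pi\left[ R_{t+1} + \gamma\, q_\pi(S_{t+1}, A_{t+1}) \mid S_t = s,\ A_t = a \right] \\
            &= \sum_{s',\, r} p(s', r \mid s, a) \left[ r + \gamma \sum_{a'} \pi(a' \mid s')\, q_\pi(s', a') \right]
\end{aligned}
$$

Both have the same recursive shape: the value now equals the expected immediate reward plus the discounted value of whatever comes next.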
As usual, let's break down these equations. We're going to focus on the state value function. ...