This chapter lays out the basic theory of reinforcement learning. It introduces the notation used in the reinforcement learning literature and gives detailed explanations and proofs of the underlying concepts. It provides the foundation for the reinforcement learning algorithms introduced in the next chapter.
Richard Bellman pioneered the development of reinforcement learning in the 1950s (Dreyfus, 2002) by formulating the Bellman equation, which governs optimal state-action selection in a Markov decision problem ...