December 2018
Beginner to intermediate
682 pages
18h 1m
English
Markov decision process (MDP) formally describes an environment for reinforcement learning. Where:
Central idea of MDP: MDP works on the simple Markovian property of a state; for example, St+1 is entirely dependent on latest state St rather than any historic dependencies. In the following equation, the current state captures all the relevant information from the history, which means ...
Read now
Unlock full access