November 2019
Intermediate to advanced
296 pages
7h 52m
English
We assume that the simple environment has four global states. In every state, there are two possible actions. For simplicity, the transition can be settled deterministically, which means the next state is decided only by the current state and action taken by the agent. Here is the diagram illustrating the MDP we are discussing:

is the initial state, and
is the goal. There are two possible actions, , and
Read now
Unlock full access