April 2018
The reward quantifies how useful it is to enter a state. It can be written in three forms, R(s), R(s, a), and R(s, a, s'), and these are equivalent in the sense that a problem stated in one form can be restated in another.
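As a rough illustration of how two of these forms relate, the sketch below collapses a transition-based reward R(s, a, s') into an action-based reward R(s, a) by taking the expectation over next states. The two-state MDP, its transition probabilities, and the names `P`, `reward_sas`, and `reward_sa` are all hypothetical and chosen only for this example.

```python
# Hypothetical two-state, two-action MDP; every number here is made up.
# P[(s, a)] maps each next state s' to the probability P(s' | s, a).
P = {
    ("s0", "stay"): {"s0": 0.9, "s1": 0.1},
    ("s0", "move"): {"s0": 0.2, "s1": 0.8},
    ("s1", "stay"): {"s0": 0.0, "s1": 1.0},
    ("s1", "move"): {"s0": 0.7, "s1": 0.3},
}


def reward_sas(s, a, s_next):
    """R(s, a, s'): +10 for landing in s1, minus a cost of 1 for moving."""
    landing_bonus = 10.0 if s_next == "s1" else 0.0
    action_cost = 1.0 if a == "move" else 0.0
    return landing_bonus - action_cost


def reward_sa(s, a):
    """R(s, a): expected value of R(s, a, s') over the next state s'."""
    return sum(prob * reward_sas(s, a, s_next)
               for s_next, prob in P[(s, a)].items())


if __name__ == "__main__":
    for (s, a) in P:
        print(f"R({s}, {a}) = {reward_sa(s, a):.2f}")
```

Because the expected immediate reward of each state-action pair is the same in both forms, the expected return of any policy is unchanged by this conversion.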
For a particular environment, domain knowledge plays an important role in assigning rewards to different states, because even minor changes in the reward can change the optimal solution to an MDP problem.
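To see how sensitive the optimal choice can be, consider a minimal sketch with made-up numbers. The agent picks between a hypothetical short route (one step, reaching a +1 goal with probability 0.8 and a -1 pit with probability 0.2) and a long route (three steps, always reaching the goal). Every step also earns a per-step reward, with no discounting. None of these values come from the text; they are assumptions for illustration.

```python
def route_values(step_reward):
    """Expected undiscounted return of each route in the toy problem."""
    # Short route: one step, then +1 with prob. 0.8 or -1 with prob. 0.2.
    short = step_reward + (0.8 * 1.0 + 0.2 * -1.0)
    # Long route: three steps, then the +1 goal with certainty.
    long_ = 3 * step_reward + 1.0
    return {"short": short, "long": long_}


def best_route(step_reward):
    """Return the name of the route with the higher expected return."""
    values = route_values(step_reward)
    return max(values, key=values.get)


if __name__ == "__main__":
    for r in (-0.1, -0.3):
        print(f"step reward {r}: values {route_values(r)}, best {best_route(r)}")
```

In this toy setup the preferred route switches from long to short once the per-step reward drops below -0.2, so a change of only 0.2 in the reward is enough to alter the optimal behavior.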
There are two approaches to rewarding our agent when it takes a certain action. They are: