June 2018
Intermediate to advanced
436 pages
10h 33m
English
In reinforcement learning, a policy is a set of rules or a strategy. Therefore, one of the learning outcomes is to discover a good strategy that observes the long-term consequences of actions in each state. So, technically, a policy defines an action to be taken in a given state. The following diagram shows the optimal action given any state:

The short-term consequence is easy to calculate:It is just the reward. Although performing an action yields an immediate reward, it is not always a good idea to choose the action greedily with the best reward. There may be different types ...