April 2018
The first assumption is an infinite horizon, that is, the agent has an unlimited number of time steps to reach the goal state from the start state. Therefore, the policy is a function of the state alone:
$$\pi(s) \rightarrow a$$
The policy function doesn't take the remaining time steps into consideration. Had the horizon been finite, the policy would have been
$$\pi(s, t) \rightarrow a$$
where $t$ is the number of time steps left to get the task done.
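Therefore, without the assumption of an infinite horizon, the policy would not be stationary, that is, it would not have the form $\pi(s) \rightarrow a$; rather, it would be $\pi(s, t) \rightarrow a$, with the chosen action depending on the time remaining as well as on the state.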
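To make the dependence on the remaining time concrete, here is a minimal sketch in Python. The two-state MDP, its state and action names, its rewards, and the function `finite_horizon_policy` are all hypothetical and chosen only for illustration; rewards are undiscounted. The sketch uses backward induction to compute the best action for each state and each number of steps left, and shows that the best action in the same state can change with `t`.

```python
# Hypothetical 2-state deterministic MDP (illustrative assumption, not from the text):
# TRANSITIONS[state][action] = (next_state, reward)
TRANSITIONS = {
    "start": {"safe": ("start", 1.0), "invest": ("rich", 0.0)},
    "rich": {"collect": ("rich", 3.0)},
}


def finite_horizon_policy(horizon):
    """Backward induction over an undiscounted finite horizon.

    Returns a dict mapping (state, t) -> best action,
    where t is the number of time steps left.
    """
    value = {s: 0.0 for s in TRANSITIONS}  # return with 0 steps left
    policy = {}
    for t in range(1, horizon + 1):
        new_value = {}
        for s, actions in TRANSITIONS.items():
            best_action, best_return = None, float("-inf")
            for a, (s_next, r) in actions.items():
                # Immediate reward plus best return with t - 1 steps left
                ret = r + value[s_next]
                if ret > best_return:
                    best_action, best_return = a, ret
            policy[(s, t)] = best_action
            new_value[s] = best_return
        value = new_value
    return policy


policy = finite_horizon_policy(horizon=3)
for t in (1, 2, 3):
    print(f"steps left = {t}: action in 'start' = {policy[('start', t)]}")
```

With one step left, the immediate reward of `safe` wins; with two or more steps left, `invest` pays off because the larger reward in `rich` can still be collected. The policy for `start` therefore needs `t` as an input, which is exactly what the infinite-horizon assumption removes.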