November 2019
Intermediate to advanced
296 pages
7h 52m
English
Because taking an expected value is a linear operation, the action-value function can be rewritten as follows:
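
In one common notation, the split that linearity permits can be sketched as below. The symbols are assumptions, since this excerpt does not define its own: $G_t$ is the discounted return, $r_{t+1}$ the immediate reward, $\gamma$ the discount factor, and $\pi$ the policy.

```latex
\begin{aligned}
Q^{\pi}(s, a)
  &= \mathbb{E}_{\pi}\left[ G_t \mid s_t = s,\ a_t = a \right] \\
  &= \mathbb{E}_{\pi}\left[ r_{t+1} + \gamma G_{t+1} \mid s_t = s,\ a_t = a \right] \\
  &= \mathbb{E}\left[ r_{t+1} \mid s_t = s,\ a_t = a \right]
     + \gamma\, \mathbb{E}_{\pi}\left[ G_{t+1} \mid s_t = s,\ a_t = a \right]
\end{aligned}
```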

Although we omit the details of the calculation here, the Bellman equation for the action-value function is derived as follows:
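
For reference, the Bellman expectation equation for the action-value function is usually stated in the form below. The notation is assumed rather than taken from this excerpt: $P(s' \mid s, a)$ is the transition function, $R(s, a, s')$ the reward, $\gamma$ the discount factor, and $\pi(a' \mid s')$ the policy.

```latex
Q^{\pi}(s, a)
  = \sum_{s'} P(s' \mid s, a)
    \left[ R(s, a, s') + \gamma \sum_{a'} \pi(a' \mid s')\, Q^{\pi}(s', a') \right]
```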

This expression shows that the action value of every state-action pair can be computed recursively, which simplifies matters considerably. However, it does not resolve the essential problem, because the Bellman equation still depends on the transition function, ...
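
To make the recursion concrete, here is a minimal sketch that applies the Bellman expectation equation repeatedly until the action values stop changing. Everything in it is an illustrative assumption rather than the book's code: the function name `evaluate_q`, the two-state toy problem, the array layout `P[s, a, s']`, and the use of an expected reward `R[s, a]`. Note that the update needs `P` explicitly, which is exactly the dependence on the transition function described above.

```python
import numpy as np


def evaluate_q(P, R, pi, gamma=0.9, tol=1e-10, max_iter=10_000):
    """Iterate the Bellman expectation equation for Q under a fixed policy.

    P[s, a, s2] : probability of moving to state s2 after action a in state s
    R[s, a]     : expected immediate reward (hypothetical layout)
    pi[s, a]    : probability that the policy picks action a in state s
    """
    Q = np.zeros_like(R, dtype=float)
    for _ in range(max_iter):
        # State value under the policy: V(s') = sum_a' pi(a'|s') Q(s', a')
        V = (pi * Q).sum(axis=1)
        # Bellman backup: Q(s, a) = R(s, a) + gamma * sum_s' P(s'|s, a) V(s')
        Q_new = R + gamma * (P @ V)
        if np.max(np.abs(Q_new - Q)) < tol:
            return Q_new
        Q = Q_new
    return Q


# Toy problem (assumed): action 0 always leads to state 0, action 1 to state 1.
P = np.array([
    [[1.0, 0.0], [0.0, 1.0]],
    [[1.0, 0.0], [0.0, 1.0]],
])
R = np.array([[0.0, 1.0], [0.0, 2.0]])
pi = np.full((2, 2), 0.5)  # uniform random policy

Q = evaluate_q(P, R, pi, gamma=0.5)
print(Q)
```

Without access to `P`, the backup line cannot be evaluated, so the recursion alone is not enough when the environment's dynamics are unknown.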