May 2019
Intermediate to advanced
664 pages
15h 41m
English
The greedy algorithm in RL is a pure exploitation algorithm: it does no exploration at all. A greedy algorithm always selects the action with the highest estimated action value. The action value is estimated from past experience by averaging the rewards observed so far for that action.
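As a rough illustration of the averaging described above, the following minimal sketch keeps a running sample average per action and picks the action with the highest estimate. The names `update_estimate`, `greedy_action`, `q`, and `n` are hypothetical, chosen here for illustration, and the reward sequence is made-up example data.

```python
def update_estimate(q, n, action, reward):
    """Update the sample-average estimate for one action in place.

    q[action] holds the mean of the rewards seen so far for that action,
    and n[action] holds how many times the action has been tried.
    """
    n[action] += 1
    # Incremental form of the mean: new_mean = old_mean + (reward - old_mean) / count
    q[action] += (reward - q[action]) / n[action]


def greedy_action(q):
    """Return the index of the action with the highest estimate (first one on ties)."""
    return max(range(len(q)), key=lambda a: q[a])


# Three actions, all estimates start at zero (an assumption of this sketch).
q = [0.0, 0.0, 0.0]
n = [0, 0, 0]

# Hypothetical (action, reward) observations.
for action, reward in [(0, 1.0), (1, 0.0), (0, 0.0), (2, 2.0)]:
    update_estimate(q, n, action, reward)

best = greedy_action(q)
```

Because the selection never deviates from the current best estimate, an action whose early rewards happened to be low may never be tried again, which is the lack of exploration noted above.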
However, a greedy algorithm can be a smart approach if our estimates of the action values are close to the expected action values; if we knew the true reward distribution, we could simply select the best action every time. An epsilon-greedy algorithm is a simple combination of the greedy and random approaches.
Epsilon helps to make this estimate. It adds exploration as part of the greedy ...
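One common way to combine the greedy and random approaches can be sketched as follows: with probability epsilon a uniformly random action is chosen, and otherwise the greedy action is chosen. This is a minimal sketch under that assumption, not necessarily the exact formulation used in the rest of the chapter; the function name `epsilon_greedy_action` and its parameters are hypothetical.

```python
import random


def epsilon_greedy_action(q, epsilon, rng=random):
    """Pick an action index from the estimates in q.

    With probability epsilon, explore: return a uniformly random action.
    Otherwise, exploit: return the action with the highest estimate.
    """
    if rng.random() < epsilon:
        return rng.randrange(len(q))  # explore
    return max(range(len(q)), key=lambda a: q[a])  # exploit


# Hypothetical action-value estimates for three actions.
estimates = [0.1, 0.9, 0.3]

# epsilon = 0 reduces to the purely greedy choice.
always_greedy = epsilon_greedy_action(estimates, epsilon=0.0)

# epsilon = 0.1 explores on roughly one step in ten.
sometimes_random = epsilon_greedy_action(estimates, epsilon=0.1)
```

Setting epsilon to 0 recovers the greedy algorithm, while setting it to 1 gives purely random selection; values in between trade off exploitation against exploration.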