The ε-greedy strategy

We have already explored the ideas behind the ε-greedy strategy and implemented it to drive exploration in algorithms such as Q-learning and DQN. It is a very simple approach, yet it performs well even on non-trivial tasks. This is the main reason for its widespread use in many deep reinforcement learning algorithms.

To refresh your memory, ε-greedy takes the best action most of the time, but from time to time, it selects a random action. The probability of choosing a random action is dictated by the ε value, ...
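As a reminder of how this selection rule looks in practice, here is a minimal sketch in Python. It is not the book's implementation: the function name `epsilon_greedy`, the `q_values` list of estimated action values, and the optional `rng` argument are all assumptions made for illustration.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick an action index from a list of estimated action values.

    Hypothetical helper: with probability `epsilon` a uniformly random
    action is returned (exploration); otherwise the action with the
    highest estimated value is returned (exploitation).
    """
    n_actions = len(q_values)
    # rng.random() is in [0.0, 1.0), so epsilon=0 never explores
    # and epsilon=1 always explores.
    if rng.random() < epsilon:
        return rng.randrange(n_actions)
    # Greedy choice; ties are broken in favor of the lowest index.
    return max(range(n_actions), key=lambda a: q_values[a])


if __name__ == "__main__":
    rng = random.Random(0)  # seeded only to make the demo repeatable
    q = [0.1, 0.5, 0.2]     # made-up action values for illustration
    picks = [epsilon_greedy(q, epsilon=0.1, rng=rng) for _ in range(1000)]
    # Most picks should be the greedy action (index 1).
    print(picks.count(1) / len(picks))
```

With a small ε, the greedy action dominates the selections, while the occasional random pick keeps the other actions from being ignored entirely.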

