The Epsilon-Greedy approach

The Epsilon-Greedy approach is a widely used solution to the explore-exploit dilemma. Exploration means trying new options through experimentation to learn what they are worth, while exploitation means repeatedly choosing the options already known to perform well and refining the estimates of their values.

The Epsilon-Greedy approach is very simple to understand and easy to implement:

epsilon() = 0.05 or 0.1 # any small value between 0 and 1
# epsilon() is the probability of exploration
p = random number between 0 and ...
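The pseudocode above is cut off in this excerpt, so the following is only a minimal Python sketch of the epsilon-greedy rule as it is commonly described, not necessarily the book's own listing. It assumes that p is drawn uniformly from the range 0 to 1, that the agent explores when p is less than epsilon, and that it otherwise exploits the action with the highest estimated value. The names `epsilon_greedy`, `q_values`, and `rng` are hypothetical and chosen for illustration.

```python
import random


def epsilon_greedy(q_values, epsilon=0.1, rng=random):
    """Pick an action index from a list of estimated action values.

    Illustrative sketch: the names and the default epsilon are assumptions.
    """
    p = rng.random()  # uniform random number in [0.0, 1.0)
    if p < epsilon:
        # Explore: choose any action uniformly at random.
        return rng.randrange(len(q_values))
    # Exploit: choose the action with the highest estimated value
    # (the first such index if several actions tie).
    return max(range(len(q_values)), key=lambda a: q_values[a])


if __name__ == "__main__":
    estimates = [0.1, 0.9, 0.3]  # made-up value estimates for three actions
    print(epsilon_greedy(estimates, epsilon=0.1))
```

With epsilon set to 0 the sketch always exploits, and with epsilon set to 1 it always explores; small values such as 0.05 or 0.1 mix mostly greedy choices with occasional random ones.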
