Epsilon-Greedy is a widely used solution to the explore-exploit dilemma. Exploration means trying new options through experimentation to learn what they are worth, while exploitation means repeating the options that already look best, which also refines the estimates of their values.
The Epsilon-Greedy approach is very simple to understand and easy to implement:
epsilon = 0.05 or 0.1  # any small value between 0 and 1
# epsilon is the probability of exploration
p = random number between 0 and ...
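The pseudocode above can be fleshed out into a minimal Python sketch. It assumes the usual Epsilon-Greedy rule: draw p uniformly from [0, 1), explore when p < epsilon, and otherwise exploit the option with the highest estimated value. The names `choose_action`, `update_estimate`, `estimates`, and `counts` are hypothetical and chosen only for illustration, and the sample-average update is one common choice rather than something the text prescribes.

```python
import random


def choose_action(estimates, epsilon=0.1, rng=random):
    """Return the index of the option to try next (illustrative helper)."""
    p = rng.random()  # uniform random number in [0.0, 1.0)
    if p < epsilon:
        # Explore: pick any option uniformly at random.
        return rng.randrange(len(estimates))
    # Exploit: pick the option with the highest current estimate
    # (ties go to the lowest index).
    return max(range(len(estimates)), key=lambda i: estimates[i])


def update_estimate(estimates, counts, action, reward):
    """Update the running average reward of the chosen option in place."""
    counts[action] += 1
    # Incremental mean: new = old + (reward - old) / n
    estimates[action] += (reward - estimates[action]) / counts[action]
```

A typical loop would call `choose_action`, observe a reward for the returned option, and pass that reward to `update_estimate`. With `epsilon=0.1`, roughly one choice in ten is exploratory and the rest go to the option with the best estimate so far.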