October 2017
Beginner to intermediate
270 pages
7h
English
One of the best-known reinforcement learning techniques, and the one we will implement in our example, is Q-learning.
Q-learning can be used to find an optimal action for any given state in a finite Markov decision process. It does so by learning the Q-function, Q(s, a), which represents the maximum discounted future reward we can obtain when we perform action a in state s.
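To make "discounted future reward" concrete, here is a minimal sketch that sums a sequence of rewards, weighting each by a discount factor raised to its time step. The function name `discounted_return`, the parameter `gamma`, and the sample rewards are assumptions chosen for illustration; they do not come from the text. The Q-value itself is the best such return achievable after taking action a in state s, which this sketch does not compute.

```python
def discounted_return(rewards, gamma=0.9):
    """Sum the rewards, weighting the reward at step t by gamma ** t.

    `gamma` is a hypothetical discount factor between 0 and 1.
    """
    total = 0.0
    for t, reward in enumerate(rewards):
        total += (gamma ** t) * reward
    return total


# Illustrative rewards received at steps 0, 1 and 2.
rewards = [1.0, 0.0, 2.0]
print(discounted_return(rewards, gamma=0.5))  # 1.0 + 0.0 + 0.25 * 2.0 = 1.5
```

A smaller `gamma` makes rewards that arrive later count for less, so the agent favors immediate rewards; a value close to 1 makes it more far-sighted.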
Once we know the Q-function, the optimal action a in state s is the one with the highest Q-value. We can then define a policy π(s) that gives us the optimal action in any state, expressed as follows:

$$\pi(s) = \arg\max_{a} Q(s, a)$$
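The policy that picks the action with the highest Q-value in each state, π(s) = argmax over a of Q(s, a), can be sketched in a few lines. This is a minimal illustration that assumes the Q-function is already known and stored as a table; the name `greedy_policy` and the numbers in `q_table` are hypothetical and only serve the example.

```python
# Hypothetical Q-table: one row per state, one column per action.
q_table = [
    [0.1, 0.5, 0.2],  # state 0
    [0.7, 0.3, 0.0],  # state 1
    [0.0, 0.4, 0.9],  # state 2
]


def greedy_policy(q, state):
    """Return the index of the action with the highest Q-value in `state`."""
    action_values = q[state]
    return max(range(len(action_values)), key=action_values.__getitem__)


for state in range(len(q_table)):
    print(f"state {state}: best action {greedy_policy(q_table, state)}")
```

With this table the policy selects action 1 in state 0, action 0 in state 1, and action 2 in state 2. If two actions tie, `max` returns the first one it encounters.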

We can define the Q-function for ...