April 2019
Intermediate to advanced
212 pages
5h 34m
English
The update function modifies the Q-values according to our familiar Bellman equation:
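In its standard Q-learning form, the update can be written as follows. The notation is assumed here: $\alpha$ is the learning rate, $\gamma$ the discount factor, $s_t$ and $a_t$ the current state and action, and $r_{t+1}$ the reward received on moving to $s_{t+1}$.

```latex
Q(s_t, a_t) \leftarrow Q(s_t, a_t)
  + \alpha \left[ r_{t+1} + \gamma \max_{a} Q(s_{t+1}, a) - Q(s_t, a_t) \right]
```

When $s_{t+1}$ is terminal there is no future value to bootstrap from, so the bracketed target reduces to $r_{t+1} - Q(s_t, a_t)$.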

We use the alpha and gamma values we declared earlier, set the new Q-value for the current state based on the maximum Q-value for the next state, and fit the model to our new state and Q-value:
def update(self, state, action, reward, next_state, done):
    q_update = reward ...
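As a rough illustration of the whole step, here is a minimal, self-contained sketch of an update with the same signature. It is not the book's listing: `LinearQModel` and `Agent` are hypothetical names, the model is a single NumPy linear layer standing in for whatever network the agent actually uses, and the way `alpha` is applied (blending the old Q-value toward the target) is an assumption.

```python
import numpy as np


class LinearQModel:
    """Hypothetical stand-in model: one linear layer, Q(state) = state @ W."""

    def __init__(self, n_features, n_actions, lr=0.1):
        self.W = np.zeros((n_features, n_actions))
        self.lr = lr

    def predict(self, state):
        # state has shape (1, n_features); result has shape (1, n_actions).
        return state @ self.W

    def fit(self, state, target):
        # One gradient step on the squared error between prediction and target.
        error = self.predict(state) - target
        self.W -= self.lr * state.T @ error


class Agent:
    """Hypothetical agent holding alpha, gamma, and the model."""

    def __init__(self, n_features, n_actions, alpha=0.5, gamma=0.9):
        self.alpha = alpha  # learning rate for the Q-value update (assumed)
        self.gamma = gamma  # discount factor (assumed)
        self.model = LinearQModel(n_features, n_actions)

    def update(self, state, action, reward, next_state, done):
        # The target is the bare reward on a terminal step; otherwise it is
        # the reward plus the discounted best Q-value of the next state.
        q_target = reward
        if not done:
            next_q = self.model.predict(next_state)[0]
            q_target = reward + self.gamma * np.max(next_q)

        # Move only the taken action's Q-value toward the target by alpha.
        q_values = self.model.predict(state)
        q_values[0][action] += self.alpha * (q_target - q_values[0][action])

        # Fit the model so its prediction for this state matches the new values.
        self.model.fit(state, q_values)
```

Calling `agent.update(state, action, reward, next_state, done)` once per environment step, with states shaped `(1, n_features)`, nudges the model's estimate for the chosen action while leaving the other actions' targets equal to their current predictions.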