Q-Learning in Python

The environment and the Q-Learning algorithm discussed in the previous section can be implemented in Python. Since the policy is just a simple table, there is no need for Keras at this point. Listing 9.3.1 shows q-learning-9.3.1.py, the implementation of the simple deterministic world (environment, agent, action, and Q-Table algorithms) using the QWorld class. For conciseness, the functions dealing with the user interface are not shown.
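To illustrate why a plain table is enough, the following is a minimal sketch of a Q-Table held in nested Python lists, together with the standard Q-Learning update for a deterministic world. It is not the code from Listing 9.3.1: the sizes N_STATES and N_ACTIONS, the discount factor GAMMA, and the helper names update_q() and greedy_action() are assumptions made for this example.

```python
# Minimal Q-Table sketch. All names and sizes here are illustrative
# assumptions, not taken from Listing 9.3.1.
N_STATES = 6    # assumed number of states
N_ACTIONS = 4   # assumed number of actions
GAMMA = 0.9     # assumed discount factor

# The Q-Table is one row per state and one column per action,
# initialized to zero.
q_table = [[0.0] * N_ACTIONS for _ in range(N_STATES)]


def update_q(q_table, state, action, reward, next_state, gamma=GAMMA):
    """Apply Q(s, a) = r + gamma * max over a' of Q(s', a')."""
    q_table[state][action] = reward + gamma * max(q_table[next_state])
    return q_table[state][action]


def greedy_action(q_table, state):
    """Return the action with the highest Q value in the given state."""
    row = q_table[state]
    return row.index(max(row))
```

Because the policy is read directly from the table by greedy_action(), no function approximator (and therefore no neural network library) is involved.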

In this example, the environment dynamics are represented by self.transition_table. For every action taken, self.transition_table determines the next state. The reward for executing an action is stored in self.reward_table. Both tables are consulted every time an action is executed by the step()
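The role of the two tables can be sketched with a small stand-in class. This is not the QWorld class from the listing: the class name TinyWorld, the 2x3 grid layout, the goal state, the action encoding, and the reward value are all assumptions chosen for illustration. Only the attribute names transition_table and reward_table and the step() method name follow the text.

```python
class TinyWorld:
    """Hypothetical 2x3 deterministic grid world (not the book's QWorld).

    States are numbered row by row:
        0 1 2
        3 4 5
    Assumed actions: 0 = left, 1 = down, 2 = right, 3 = up.
    """

    N_ROWS, N_COLS, N_ACTIONS = 2, 3, 4

    def __init__(self, goal_state=2):
        self.n_states = self.N_ROWS * self.N_COLS
        self.goal_state = goal_state  # assumed terminal state
        # transition_table[state][action] -> next state
        self.transition_table = [
            [self._move(s, a) for a in range(self.N_ACTIONS)]
            for s in range(self.n_states)
        ]
        # reward_table[state][action] -> reward; assumed to be 1.0 only
        # for a move that enters the goal from another state
        self.reward_table = [
            [1.0 if s != goal_state and nxt == goal_state else 0.0
             for nxt in self.transition_table[s]]
            for s in range(self.n_states)
        ]
        self.state = 0

    def _move(self, state, action):
        """Compute the next state, staying in place at the grid walls."""
        row, col = divmod(state, self.N_COLS)
        if action == 0:
            col = max(col - 1, 0)
        elif action == 1:
            row = min(row + 1, self.N_ROWS - 1)
        elif action == 2:
            col = min(col + 1, self.N_COLS - 1)
        elif action == 3:
            row = max(row - 1, 0)
        return row * self.N_COLS + col

    def step(self, action):
        """Look up both tables, advance the state, and report the result."""
        next_state = self.transition_table[self.state][action]
        reward = self.reward_table[self.state][action]
        self.state = next_state
        done = next_state == self.goal_state
        return next_state, reward, done
```

In this sketch, step() does no computation of its own beyond two table lookups, which is what makes the world deterministic: the same state and action always produce the same next state and reward.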
