Q-Learning in Python
The environment and the Q-Learning algorithm discussed in the previous section can be implemented in Python. Since the policy is just a simple table, there is no need for Keras at this point. Listing 9.3.1 shows q-learning-9.3.1.py, the implementation of the simple deterministic world (environment, agent, action, and Q-Table algorithms) using the QWorld class. For conciseness, the functions dealing with the user interface are not shown.
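To illustrate why a table is enough here, the following is a minimal sketch and not the code from the listing. It assumes the usual update rule for a deterministic environment, Q(s, a) = r + gamma * max Q(s', a'), and uses plain Python lists. The sizes N_STATES and N_ACTIONS and the helper names greedy_action and update_q are hypothetical, chosen only for this example.

```python
# Hypothetical sizes, chosen only for illustration.
N_STATES = 6
N_ACTIONS = 4

# The policy is a plain table: one row per state, one column per action.
q_table = [[0.0] * N_ACTIONS for _ in range(N_STATES)]


def greedy_action(q_table, state):
    """Return the index of the highest-valued action for `state`."""
    row = q_table[state]
    return max(range(len(row)), key=lambda action: row[action])


def update_q(q_table, state, action, reward, next_state, gamma=0.9):
    """Assumed deterministic-world update: Q(s, a) = r + gamma * max Q(s', a')."""
    q_table[state][action] = reward + gamma * max(q_table[next_state])
    return q_table[state][action]
```

Because every entry is read and written directly by index, no function approximator such as a neural network is required for a world this small.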
In this example, the environment dynamics are represented by self.transition_table. At every action, self.transition_table determines the next state. The reward for executing an action is stored in self.reward_table. Both tables are consulted every time an action is executed by the step() method.
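The table lookups described above can be sketched as follows. This is an illustrative toy, not the QWorld class from the listing: the class name TinyWorld, the three-state layout, the reward values, and the (next_state, reward, done) return convention are all assumptions made for this example. Only the attribute names transition_table and reward_table and the method name step() come from the text.

```python
class TinyWorld:
    """Hypothetical 3-state, 2-action deterministic world (not the book's QWorld)."""

    def __init__(self):
        # transition_table[state][action] -> next state
        self.transition_table = {
            0: {0: 0, 1: 1},
            1: {0: 0, 1: 2},
            2: {0: 2, 1: 2},  # the terminal state loops on itself
        }
        # reward_table[state][action] -> reward for taking that action
        self.reward_table = {
            0: {0: 0.0, 1: 0.0},
            1: {0: 0.0, 1: 100.0},  # assumed reward for reaching the goal
            2: {0: 0.0, 1: 0.0},
        }
        self.terminal_state = 2
        self.state = 0

    def step(self, action):
        """Consult both tables, advance the state, and report the outcome."""
        reward = self.reward_table[self.state][action]
        next_state = self.transition_table[self.state][action]
        self.state = next_state
        done = next_state == self.terminal_state
        return next_state, reward, done


world = TinyWorld()
first = world.step(1)   # move from state 0 to state 1
second = world.step(1)  # move from state 1 to the terminal state 2
```

Keeping the dynamics and the rewards in separate tables means step() itself contains no environment-specific logic; changing the world only requires editing the tables.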