Applying DQN to a game

So far, we have picked actions at random and applied them to the game. Now, let's use a DQN to select the actions for playing the PacMan game.

  1. We define the q_nn policy function as follows:
def policy_q_nn(obs, env):
    # Exploration strategy - Select a random action
    if np.random.random() < explore_rate:
        action = env.action_space.sample()
    # Exploitation strategy - Select the action with the highest q
    else:
        action = np.argmax(q_nn.predict(np.array([obs])))
    return action
  2. Next, we modify the episode function so that it calculates the q_values and trains the neural network on experience sampled from the buffer. This is shown in the following code:
def episode(env, policy, r_max=0, t_max=0):
    # create the empty list to contain ...
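The two steps above can be sketched in a self-contained form: choose actions with an explore/exploit rule, then compute Q-learning targets on experience sampled from a buffer and fit the model to them. This is only an illustration under stated assumptions, not the episode function from the listing, which is cut off in this excerpt. `LinearQ` is a hypothetical NumPy stand-in for `q_nn`, and the names `epsilon_greedy`, `q_targets`, `replay`, `discount_rate`, and `batch_size` are likewise assumptions introduced here.

```python
import random
from collections import deque

import numpy as np


class LinearQ:
    """Hypothetical stand-in for q_nn: a linear map from observation to Q-values."""

    def __init__(self, n_obs, n_actions, lr=0.01, seed=0):
        rng = np.random.default_rng(seed)
        self.w = rng.normal(scale=0.1, size=(n_obs, n_actions))
        self.lr = lr

    def predict(self, obs_batch):
        # Shape (batch, n_obs) -> (batch, n_actions)
        return np.asarray(obs_batch, dtype=float) @ self.w

    def fit(self, obs_batch, targets):
        # One gradient step on the squared error between predict() and targets
        obs_batch = np.asarray(obs_batch, dtype=float)
        error = self.predict(obs_batch) - targets
        self.w -= self.lr * obs_batch.T @ error / len(obs_batch)


def epsilon_greedy(q_model, obs, n_actions, explore_rate, rng):
    # Explore with probability explore_rate, otherwise exploit
    if rng.random() < explore_rate:
        return int(rng.integers(n_actions))
    return int(np.argmax(q_model.predict(np.array([obs]))[0]))


def q_targets(q_model, batch, discount_rate):
    """Build targets from a batch of (obs, action, reward, next_obs, done) tuples."""
    obs, actions, rewards, next_obs, dones = map(np.array, zip(*batch))
    targets = q_model.predict(obs)  # start from the current estimates
    next_max = q_model.predict(next_obs).max(axis=1)
    # Terminal transitions get no bootstrap term
    not_done = 1.0 - dones.astype(float)
    updated = rewards + discount_rate * next_max * not_done
    # Only the Q-value of the action actually taken is changed
    targets[np.arange(len(batch)), actions] = updated
    return obs, targets


def replay(q_model, memory, batch_size, discount_rate):
    # Sample past experience uniformly and train on it
    batch = random.sample(list(memory), min(batch_size, len(memory)))
    obs, targets = q_targets(q_model, batch, discount_rate)
    q_model.fit(obs, targets)
```

In an episode loop, each step would append its `(obs, action, reward, next_obs, done)` tuple to a `deque` with a fixed `maxlen`, and `replay` would be called once enough transitions have accumulated. With a Keras model in place of `LinearQ`, the same targets could be passed to the model's own `fit` method.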
