Applying DQN to a game

So far, we have picked actions at random and applied them to the game. Now, let's use a DQN to select the actions when playing the PacMan game.

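We define the policy function for q_nn, policy_q_nn, as follows. With probability explore_rate it explores by sampling a random action; otherwise it exploits by choosing the action with the highest predicted q-value: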

def policy_q_nn(obs, env):
    # Exploration strategy - Select a random action
    if np.random.random() < explore_rate:
        action = env.action_space.sample()
    # Exploitation strategy - Select the action with the highest q
    else:
        action = np.argmax(q_nn.predict(np.array([obs])))
    return action
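The explore/exploit split in this policy can be tried out without a game environment or a trained network. The following is a minimal sketch, not the book's code: the function name `epsilon_greedy`, the fixed `q_values` array standing in for the output of `q_nn.predict`, and the NumPy random generator `rng` are all assumptions made for illustration.

```python
import numpy as np

def epsilon_greedy(q_values, n_actions, explore_rate, rng):
    """Pick an action index from a 1-D array of q-values (hypothetical helper)."""
    # Exploration: with probability explore_rate, pick a uniformly random action
    if rng.random() < explore_rate:
        return int(rng.integers(n_actions))
    # Exploitation: otherwise pick the action with the highest q-value
    return int(np.argmax(q_values))

rng = np.random.default_rng(0)
# Stand-in for the network's prediction for one observation (assumed values)
q_values = np.array([0.1, 0.7, 0.2, 0.0])

# explore_rate=0.0 never explores, so the greedy action is always chosen
greedy_action = epsilon_greedy(q_values, 4, explore_rate=0.0, rng=rng)
# explore_rate=1.0 always explores, so actions are sampled at random
random_actions = [epsilon_greedy(q_values, 4, explore_rate=1.0, rng=rng)
                  for _ in range(100)]
```

Setting the rate to the two extremes makes the behaviour easy to check: at 0.0 the result is always the argmax, and at 1.0 the q-values are ignored entirely.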

Next, we modify the episode function to calculate the q_values and train the neural network on experience sampled from the buffer. This is shown in the following code:

def episode(env, policy, r_max=0, t_max=0):
    # create the empty list to contain ...
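The listing above is cut off, so the following does not reproduce it. As a rough sketch of the q-value calculation that this kind of training step usually relies on, the block below builds training targets for a sampled batch using the standard DQN rule: reward plus the discounted best next q-value, with no bootstrap term on terminal steps. The function name `build_q_targets`, the `discount_rate` default, and the small hand-written arrays are assumptions for illustration only.

```python
import numpy as np

def build_q_targets(q_current, q_next, actions, rewards, dones, discount_rate=0.9):
    """Return training targets for a batch of transitions (hypothetical helper).

    q_current: q-values predicted for each state, shape (batch, n_actions)
    q_next:    q-values predicted for each next state, same shape
    actions:   index of the action taken in each transition
    rewards:   reward received in each transition
    dones:     1.0 where the episode ended on that transition, else 0.0
    """
    # Start from the current predictions so untaken actions contribute no error
    targets = q_current.copy()
    # Best achievable q-value from each next state
    best_next = q_next.max(axis=1)
    # Terminal transitions keep only the immediate reward
    td_target = rewards + discount_rate * best_next * (1.0 - dones)
    # Overwrite only the entry for the action actually taken
    targets[np.arange(len(actions)), actions] = td_target
    return targets

# A batch of two sampled transitions with made-up values
q_current = np.array([[1.0, 2.0], [0.5, 0.0]])
q_next = np.array([[3.0, 4.0], [9.0, 9.0]])
actions = np.array([0, 1])
rewards = np.array([1.0, -1.0])
dones = np.array([0.0, 1.0])

targets = build_q_targets(q_current, q_next, actions, rewards, dones)
```

With a Keras-style model such as q_nn, targets of this shape would typically serve as the labels when fitting the network on the sampled states, though how the book wires this into episode is not visible in the excerpt.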
