The DQN we'll learn here is based on a DeepMind paper (https://web.stanford.edu/class/psych209/Readings/MnihEtAlHassibis15NatureControlDeepRL.pdf). At the heart of DQN is a deep convolutional neural network that takes the raw pixels of the game screen as input (the same view a human player has), captured one screen at a time, and outputs a value for each possible action. The action with the maximum value is the one the agent chooses.
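As a minimal sketch of that selection rule, the snippet below picks the action with the largest value from a vector of network outputs. The function name `greedy_action` and the sample values are hypothetical and used only for illustration; in the real agent the values would come from the network's forward pass on the current screen.

```python
import numpy as np


def greedy_action(q_values):
    """Return the index of the action with the highest estimated value."""
    q_values = np.asarray(q_values, dtype=np.float64)
    # argmax returns the first index of the maximum if there are ties
    return int(np.argmax(q_values))


# Hypothetical network outputs: one value per possible action
example_q = [0.1, 0.7, 0.3, 0.2]
chosen = greedy_action(example_q)
print("chosen action index:", chosen)
```

Ties are broken in favour of the lowest action index here, which is simply how `np.argmax` behaves, not a choice made by the paper.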
- The first step is to get all of the modules we'll need:
import gym
import sys
import random
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt
from datetime import datetime
from scipy.misc import imresize
- We chose the Breakout game from the list of OpenAI ...