Now we will see how to train the network.
First, we define the DQN class. Its __init__ method takes the hyperparameters as keyword arguments with default values:
class DQN(object):
    def __init__(self,
                 state_size,
                 action_size,
                 session,
                 summary_writer=None,
                 exploration_period=1000,
                 minibatch_size=32,
                 discount_factor=0.99,
                 experience_replay_buffer=10000,
                 target_qnet_update_frequency=10000,
                 initial_exploration_epsilon=1.0,
                 final_exploration_epsilon=0.05,
                 reward_clipping=-1,
                 ):
Inside __init__, we store these arguments as instance attributes:
        self.state_size = state_size
        self.action_size = action_size
        self.session = session
        self.exploration_period = float(exploration_period)
        self.minibatch_size = minibatch_size
        self.discount_factor = tf.constant(discount_factor)
        self.experience_replay_buffer ...
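The defaults for exploration_period, initial_exploration_epsilon, and final_exploration_epsilon suggest that the exploration rate is annealed from 1.0 down to 0.05 over the first 1000 steps. The following is a minimal sketch of one way such a schedule could work, assuming a linear decay. The helper name linear_annealing and its signature are hypothetical and are not taken from the class above.

```python
def linear_annealing(step, exploration_period, initial_epsilon, final_epsilon):
    """Return epsilon for the given step under an assumed linear schedule.

    Hypothetical helper: epsilon falls linearly from initial_epsilon to
    final_epsilon over exploration_period steps, then stays at final_epsilon.
    """
    if step >= exploration_period:
        # Exploration period is over: keep the final exploration rate.
        return final_epsilon
    # Fraction of the exploration period that has elapsed, in [0, 1).
    fraction = step / float(exploration_period)
    return initial_epsilon - fraction * (initial_epsilon - final_epsilon)


if __name__ == "__main__":
    # Print epsilon at a few steps, using the constructor defaults shown above.
    for step in (0, 250, 500, 1000, 5000):
        print(step, linear_annealing(step, 1000.0, 1.0, 0.05))
```

With a schedule like this, the agent acts almost entirely at random at the start and becomes mostly greedy once exploration_period steps have passed, while still taking a random action 5% of the time afterwards.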