train_controller method

The train_controller method is called after we build the Controller. The first step is thus to initialize all the variables and the first state:

def train_controller(self):    with self.graph.as_default():        self.sess.run(tf.global_variables_initializer())    step = 0    total_rewards = 0    child_network_architecture = np.array([[10.0, 128.0, 1.0, 1.0] *                                           controller_params['max_layers']], dtype=np.float32)

The first child_network_architecture is a list that resembles an architecture configuration and will be the argument to NASCell, which would output the first child DNA.

The training procedure consists of two for loops: one for the number of epochs for the Controller, and another for each child network the Controller generates ...

Get Python Reinforcement Learning Projects now with O’Reilly online learning.

O’Reilly members experience live online training, plus books, videos, and digital content from 200+ publishers.