train_controller method

The train_controller method is called after we build the Controller. The first step is thus to initialize all the variables and the first state:

def train_controller(self):
    with self.graph.as_default():
        self.sess.run(tf.global_variables_initializer())

    step = 0
    total_rewards = 0
    child_network_architecture = np.array([[10.0, 128.0, 1.0, 1.0] *
                                           controller_params['max_layers']], dtype=np.float32)

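The initial child_network_architecture is a NumPy array of shape (1, 4 * max_layers) that repeats the values [10.0, 128.0, 1.0, 1.0] once per layer. It resembles an architecture configuration and is passed as the argument to NASCell, which outputs the first child DNA.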
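To make the shape of this initial state concrete, the following sketch rebuilds the array on its own. The value max_layers = 3 is an assumption chosen for illustration; the method above reads it from controller_params['max_layers']. Viewing the result as one row of four values per layer is likewise only an interpretation for readability, not something the excerpt states.

```python
import numpy as np

# Assumed value for illustration; the original reads controller_params['max_layers'].
max_layers = 3

# Multiplying a Python list repeats its contents, so the four values
# appear max_layers times inside a single outer row.
child_network_architecture = np.array(
    [[10.0, 128.0, 1.0, 1.0] * max_layers], dtype=np.float32
)

print(child_network_architecture.shape)  # (1, 12)

# Hypothetical view: regroup the flat row into one row of four values per layer.
per_layer = child_network_architecture.reshape(max_layers, 4)
print(per_layer)
```

The leading dimension of 1 means the array holds a single architecture configuration.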

The training procedure consists of two for loops: one for the number of epochs for the Controller, and another for each child network the Controller generates ...
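The two-loop structure described above can be sketched as follows. This is only a skeleton under stated assumptions: the names run_nested_loops, num_epochs, children_per_epoch, and handle_child are hypothetical, and what the method actually does with each child network is not shown in this excerpt, so it is represented here by a caller-supplied callback.

```python
def run_nested_loops(num_epochs, children_per_epoch, handle_child):
    """Call handle_child once per (epoch, child) pair and collect the results.

    All names here are hypothetical placeholders, not the book's code.
    """
    results = []
    for epoch in range(num_epochs):                  # outer loop: Controller epochs
        for child_id in range(children_per_epoch):   # inner loop: generated child networks
            results.append(handle_child(epoch, child_id))
    return results


# Usage: record which (epoch, child) pairs were visited.
visited = run_nested_loops(2, 3, lambda epoch, child_id: (epoch, child_id))
print(len(visited))  # 6
```

With 2 epochs and 3 children per epoch, the callback runs 6 times in total, which is why the cost of training grows with the product of the two loop counts.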
