The train_controller method is called after we build the Controller, so its first step is to initialize all the variables and set up the first state:
    def train_controller(self):
        with self.graph.as_default():
            self.sess.run(tf.global_variables_initializer())

        step = 0
        total_rewards = 0
        child_network_architecture = np.array(
            [[10.0, 128.0, 1.0, 1.0] * controller_params['max_layers']],
            dtype=np.float32)
The initial child_network_architecture is a NumPy array of shape (1, 4 * max_layers) that encodes an architecture configuration: the same four values repeated once per layer. It is passed as the argument to NASCell, which outputs the first child DNA.
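To make the layout of this initial state concrete, here is a minimal sketch that builds the same array outside the class and inspects it. The controller_params dictionary below is a hypothetical stand-in with max_layers set to 3 purely for illustration; the real value comes from the tutorial's configuration.

```python
import numpy as np

# Hypothetical stand-in for the tutorial's controller_params;
# max_layers = 3 is an assumption made only for this illustration.
controller_params = {'max_layers': 3}

# Same expression as in train_controller: multiplying the inner list
# repeats its four values once per layer, and the outer list adds a
# leading dimension of size 1.
child_network_architecture = np.array(
    [[10.0, 128.0, 1.0, 1.0] * controller_params['max_layers']],
    dtype=np.float32,
)

print(child_network_architecture.shape)  # (1, 12)

# Reshape to one row per layer to make the repetition visible.
per_layer = child_network_architecture.reshape(
    controller_params['max_layers'], 4)
print(per_layer)
```

Note that the list multiplication happens before the conversion to an array, so the result is one flat row of 4 * max_layers values rather than a max_layers x 4 matrix.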
The training procedure consists of two for loops: one for the number of epochs for the Controller, and another for each child network the Controller generates ...