Implementation of DDPG

This section will show you how to implement the actor-critic architecture using TensorFlow. The code structure is almost the same as the DQN implementation that was shown in the previous chapter.

The ActorNetwork is a simple MLP that takes the observation state as its input:

import tensorflow as tf


class ActorNetwork:

    def __init__(self, input_state, output_dim, hidden_layers,
                 activation=tf.nn.relu):
        self.x = input_state
        self.output_dim = output_dim
        self.hidden_layers = hidden_layers
        self.activation = activation

        # Build the network and keep a handle on its trainable variables
        # so the target actor can be updated from them later.
        with tf.variable_scope('actor_network'):
            self.output = self._build()
            self.vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES,
                                          tf.get_variable_scope().name)

    def _build(self):
        layer = self.x
        init_b = tf.constant_initializer(0.01)
        ...
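The excerpt breaks off inside _build. As a point of reference only, here is a minimal sketch of how the rest of the method could be written; the loop over hidden_layers, the use of tf.layers.dense, and the tanh output are assumptions on my part, not the book's code. A tanh on the final layer is a common DDPG choice because it bounds the action to [-1, 1], which can then be rescaled to the environment's action range.

    # Hedged sketch of the remainder of _build (assumes the ActorNetwork
    # class above); the book's actual implementation may differ.
    def _build(self):
        layer = self.x
        init_b = tf.constant_initializer(0.01)
        # Stack the fully connected hidden layers given by hidden_layers.
        for i, num_units in enumerate(self.hidden_layers):
            layer = tf.layers.dense(layer, num_units,
                                    activation=self.activation,
                                    bias_initializer=init_b,
                                    name='hidden_{}'.format(i))
        # Output layer: tanh keeps the predicted action in [-1, 1].
        return tf.layers.dense(layer, self.output_dim,
                               activation=tf.nn.tanh,
                               bias_initializer=init_b,
                               name='output')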
