Python Reinforcement Learning Projects by Rajalingappaa Shanmugamani, Yang Wenzhuo, Sean Saito

Implementation of DDPG

This section shows how to implement the actor-critic architecture using TensorFlow. The code structure closely follows the DQN implementation from the previous chapter.

The ActorNetwork is a simple MLP that takes the observation state as its input:

    import tensorflow as tf  # TensorFlow 1.x API (tf.variable_scope, tf.get_collection)

    class ActorNetwork:

        def __init__(self, input_state, output_dim, hidden_layers, activation=tf.nn.relu):

            self.x = input_state
            self.output_dim = output_dim
            self.hidden_layers = hidden_layers
            self.activation = activation

            # Build the network inside its own scope so that its trainable
            # variables can be collected by scope name.
            with tf.variable_scope('actor_network'):
                self.output = self._build()
                self.vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES,
                                              tf.get_variable_scope().name)

        def _build(self):

            layer = self.x
            init_b = tf.constant_initializer(0.01)
            ...
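Because the listing above is cut off inside _build, the following is a minimal, framework-free sketch of what an MLP actor computes, not the book's implementation. It assumes ReLU hidden layers (matching the default activation above) and a constant bias of 0.01 (matching init_b). The tanh output squashing, the uniform weight initialization, and the helper names init_layer, dense, build_actor, and actor_forward are all hypothetical choices made for illustration.

```python
import math
import random


def init_layer(n_in, n_out, rng, bias=0.01):
    """Create one dense layer: small random weights and a constant bias."""
    # Uniform initialization scaled by fan-in is an assumption of this sketch.
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_out)]
               for _ in range(n_in)]
    biases = [bias] * n_out
    return weights, biases


def dense(x, weights, biases):
    """Compute x @ W + b for a single input vector x."""
    return [sum(x[i] * weights[i][j] for i in range(len(x))) + biases[j]
            for j in range(len(biases))]


def relu(values):
    """Apply max(0, v) element-wise."""
    return [max(0.0, v) for v in values]


def build_actor(state_dim, hidden_layers, output_dim, seed=0):
    """Return a list of (weights, biases) pairs, one per layer."""
    rng = random.Random(seed)
    sizes = [state_dim] + list(hidden_layers) + [output_dim]
    return [init_layer(n_in, n_out, rng)
            for n_in, n_out in zip(sizes[:-1], sizes[1:])]


def actor_forward(state, layers):
    """Map an observation vector to an action vector.

    Hidden layers use ReLU. The final tanh is an assumption of this
    sketch; it keeps every action component within [-1, 1].
    """
    hidden = state
    for weights, biases in layers[:-1]:
        hidden = relu(dense(hidden, weights, biases))
    weights, biases = layers[-1]
    return [math.tanh(v) for v in dense(hidden, weights, biases)]


# Example: a 3-dimensional observation mapped to a 2-dimensional action.
actor = build_actor(state_dim=3, hidden_layers=[8, 8], output_dim=2)
action = actor_forward([0.5, -0.2, 0.1], actor)
```

The actor is deterministic: the same observation always yields the same action, which is why DDPG typically adds exploration noise to the output during training.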
