- The code for this recipe is based on Andrej Karpathy's blog post (http://karpathy.github.io/2016/05/31/rl/), and part of it has been adapted from code by Sam Greydanus (https://gist.github.com/karpathy/a4166c7fe253700972fcbc77e4ea32c5).
- We have the usual imports:
import numpy as np
import gym
import matplotlib.pyplot as plt
import tensorflow as tf
- We define our PolicyNetwork class. The model hyperparameters are initialized during class construction. The __init__ method defines the placeholders for the input state, self.tf_x; the predicted action, self.tf_y; and the corresponding reward, self.tf_epr. It also defines the network weights and the ops for predicting the action value, training, and updating. You can see that the class construction also initiates an interactive ...
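To make the shape of such a class concrete, here is a minimal sketch in plain NumPy rather than TensorFlow. It is not the recipe's implementation: the class name PolicyNetworkSketch, the argument names, the default values, and the single ReLU hidden layer are all assumptions chosen for illustration. The state that the recipe feeds through self.tf_x is passed here as an ordinary function argument, and only the constructor and the prediction step are shown.

```python
import numpy as np


class PolicyNetworkSketch:
    """Hypothetical NumPy stand-in for a policy network (not the recipe's class)."""

    def __init__(self, n_inputs, n_hidden=200, n_actions=3,
                 gamma=0.99, eta=1e-3, seed=0):
        # Hyperparameters are stored at construction time.
        # The default values here are assumptions for illustration.
        self.gamma = gamma  # reward discount factor
        self.eta = eta      # learning rate
        rng = np.random.default_rng(seed)
        # Network weights for one hidden layer, scaled by 1/sqrt(fan_in).
        self.W1 = rng.standard_normal((n_inputs, n_hidden)) / np.sqrt(n_inputs)
        self.W2 = rng.standard_normal((n_hidden, n_actions)) / np.sqrt(n_hidden)

    def predict(self, x):
        """Return action probabilities for a batch of states, shape (batch, n_inputs)."""
        h = np.maximum(0.0, x @ self.W1)             # ReLU hidden layer
        logits = h @ self.W2
        logits -= logits.max(axis=1, keepdims=True)  # subtract max for numerical stability
        exp = np.exp(logits)
        return exp / exp.sum(axis=1, keepdims=True)  # softmax over actions


# Usage: a flattened 80x80 frame as the state (the frame size is an assumption).
net = PolicyNetworkSketch(n_inputs=80 * 80)
probs = net.predict(np.zeros((1, 80 * 80)))
```

Each row of probs sums to 1 and can be sampled from to pick an action. In a TensorFlow 1.x version, the same quantities would be built as graph ops fed through placeholders instead of being computed eagerly.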