In this section, we'll implement an agent that plays the cart-pole game with the help of A2C. We'll use the familiar tools: OpenAI Gym and TensorFlow. Recall that the state of the cart-pole environment is described by the cart's position and velocity and the pole's angle and angular velocity. We'll use feedforward networks with one hidden layer for both the actor and the critic. Let's start!
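Before diving into the steps, here is a minimal NumPy sketch of the two one-hidden-layer feedforward networks just described. The section itself builds them with TensorFlow; the hidden-layer size, helper names, and weight initialization here are illustrative assumptions, not the book's code:

```python
import numpy as np

rng = np.random.default_rng(0)

STATE_SIZE = 4    # cart position, cart velocity, pole angle, pole angular velocity
HIDDEN_SIZE = 20  # illustrative choice, not taken from the text
NUM_ACTIONS = 2   # push left / push right

def init_net(out_size):
    # One hidden layer with small random weights.
    return {
        'W1': rng.normal(scale=0.1, size=(STATE_SIZE, HIDDEN_SIZE)),
        'b1': np.zeros(HIDDEN_SIZE),
        'W2': rng.normal(scale=0.1, size=(HIDDEN_SIZE, out_size)),
        'b2': np.zeros(out_size),
    }

def forward(net, state):
    hidden = np.tanh(state @ net['W1'] + net['b1'])
    return hidden @ net['W2'] + net['b2']

actor = init_net(NUM_ACTIONS)   # outputs one logit per action
critic = init_net(1)            # outputs a scalar state-value estimate

state = np.array([0.0, 0.1, 0.02, -0.1])       # an example cart-pole state
logits = forward(actor, state)
probs = np.exp(logits) / np.exp(logits).sum()  # softmax -> action probabilities
value = forward(critic, state)[0]              # V(s)
```

The actor and critic share the same shape of network but differ only in the output layer: the actor produces a probability distribution over the two actions, while the critic produces a single value estimate.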
- First, we'll do the imports:
from collections import namedtuple

import gym
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
- Next, we'll create the environment:
env = gym.make('CartPole-v0')
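As a quick sanity check (not part of the section's code), we can inspect the new environment's spaces, which confirm the 4-dimensional state and the two discrete actions:

```python
import gym

env = gym.make('CartPole-v0')

# The observation is a 4-dimensional vector and there are two discrete actions.
print(env.observation_space.shape)  # (4,)
print(env.action_space.n)           # 2
```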
- Then, we'll define some hyperparameters that describe the training. We'll take the INPUT_SIZE and ACTIONS_COUNT ...
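The text is cut off at this point, but a common way to set these two constants (a sketch, assuming they simply mirror the environment's spaces) is to read them from the environment itself:

```python
import gym

env = gym.make('CartPole-v0')

# Assumption: derive the constants from the environment's spaces,
# since the original text is truncated here.
INPUT_SIZE = env.observation_space.shape[0]  # 4 state variables
ACTIONS_COUNT = env.action_space.n           # 2 actions: left or right
```

Deriving the sizes from the environment rather than hard-coding 4 and 2 keeps the code reusable for other environments with discrete action spaces.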