We develop the actor-critic algorithm to solve the CartPole environment as follows:
- Import all the necessary packages and create a CartPole instance:
>>> import gym
>>> import torch
>>> import torch.nn as nn
>>> import torch.nn.functional as F
>>> env = gym.make('CartPole-v0')
- Let's start with the actor-critic neural network model:
>>> class ActorCriticModel(nn.Module):
...     def __init__(self, n_input, n_output, n_hidden):
...         super(ActorCriticModel, self).__init__()
...         self.fc = nn.Linear(n_input, n_hidden)
...         self.action = nn.Linear(n_hidden, n_output)
...         self.value = nn.Linear(n_hidden, 1)
...
...     def forward(self, x):
...         x = torch.Tensor(x)
...         x = F.relu(self.fc(x))
...         action_probs = F.softmax(self.action(x),
...
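The model above shares one hidden layer between two output heads: an actor head that produces action probabilities and a critic head that produces a state value. The following standalone sketch may help illustrate how such a two-headed network behaves on a single state. It is not the listing above: the class name `TwoHeadNet`, the softmax axis `dim=-1`, the hidden size of 50, and the sample state values are all assumptions made for illustration. It needs only PyTorch, so it runs without a Gym installation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TwoHeadNet(nn.Module):
    """Hypothetical stand-in for a shared-trunk actor-critic network."""

    def __init__(self, n_input, n_output, n_hidden):
        super().__init__()
        self.fc = nn.Linear(n_input, n_hidden)       # shared hidden layer
        self.action = nn.Linear(n_hidden, n_output)  # actor head: action logits
        self.value = nn.Linear(n_hidden, 1)          # critic head: state value

    def forward(self, x):
        x = torch.as_tensor(x, dtype=torch.float32)
        x = F.relu(self.fc(x))
        # Assumption: softmax over the last axis, so that the action
        # probabilities for a state sum to 1.
        action_probs = F.softmax(self.action(x), dim=-1)
        state_value = self.value(x)
        return action_probs, state_value


torch.manual_seed(0)  # fixed seed so repeated runs behave the same
# CartPole has a 4-dimensional state and 2 discrete actions.
model = TwoHeadNet(n_input=4, n_output=2, n_hidden=50)
state = [0.01, -0.02, 0.03, 0.04]  # made-up state, not taken from the environment
action_probs, state_value = model(state)

# The actor's output is a distribution over actions; sample one action from it.
sampled_action = torch.multinomial(action_probs, 1).item()
```

Sharing the hidden layer lets the actor and critic learn from the same state features, while the separate linear heads keep the policy output (a probability vector) distinct from the value output (a single scalar).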