We'll code the strategy we discussed earlier as follows (the code file is available as Deep_Q_learning_to_balance_a_cart_pole.ipynb on GitHub):
- Create the environment and store the action size and state size in variables:
import gym
env = gym.make('CartPole-v0')
state_size = env.observation_space.shape[0]
action_size = env.action_space.n
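As a quick sanity check (an illustrative addition, not part of the original notebook): for CartPole-v0 the observation is a four-dimensional vector and there are two discrete actions, so the two variables should hold 4 and 2:

print(state_size)   # 4: cart position, cart velocity, pole angle, pole angular velocity
print(action_size)  # 2: push the cart to the left or to the right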
A cart-pole environment consists of a cart that slides left or right along a track with a pole hinged to its top; the episode continues for as long as the pole stays balanced upright, and ends once the pole tips too far over or the cart moves off the track.
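To see the environment's dynamics before training anything, the snippet below (a minimal sketch added for illustration, using the older gym API this recipe relies on) runs one episode with random actions; the environment returns a reward of +1 for every step the pole survives:

state = env.reset()
done = False
while not done:
    action = env.action_space.sample()             # pick a random action
    state, reward, done, info = env.step(action)   # +1 reward per surviving step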
- Import the relevant packages:
import numpy as np
import random
from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import Adam
from collections import deque
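The deque import points at the experience-replay memory used later in the recipe. As a hedged sketch of that idea (the buffer size of 2000 and the remember helper name are assumptions, not taken from the original), past transitions can be stored in a bounded deque and sampled later for training:

memory = deque(maxlen=2000)  # assumed size: keep only the most recent 2,000 transitions

def remember(state, action, reward, next_state, done):
    # store one transition tuple for later replay-based training
    memory.append((state, action, reward, next_state, done))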
- Define a model:
model = Sequential()
model.add(Dense(24, input_dim=state_size, activation='relu'))
...
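The listing above is truncated in the source. A typical completion, assuming the common DQN architecture for CartPole (a second 24-unit hidden layer, a linear output with one Q-value per action, and mean-squared-error loss with Adam; the second layer width and the learning rate of 0.001 are assumptions), would look like this:

model = Sequential()
model.add(Dense(24, input_dim=state_size, activation='relu'))  # first hidden layer over the 4 state inputs
model.add(Dense(24, activation='relu'))                        # second hidden layer (width is an assumption)
model.add(Dense(action_size, activation='linear'))             # one Q-value output per action
model.compile(loss='mse', optimizer=Adam(lr=0.001))            # lr=0.001 is an assumed value

A linear output layer is the usual choice here because Q-values are unbounded regression targets rather than probabilities.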