In this section, we will start implementing our intelligent agent step by step. We will implement the well-known Q-learning algorithm using the NumPy library and the MountainCar-v0 environment from the OpenAI Gym library.
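As a rough preview of where we are heading, the following is a minimal sketch of a tabular Q-learning update written with NumPy. It is not the final agent. The observation bounds are those of MountainCar-v0 (position in [-1.2, 0.6], velocity in [-0.07, 0.07], three discrete actions), while the names `discretize` and `q_learning_update` and the values of `NUM_BINS`, `ALPHA`, and `GAMMA` are assumptions chosen only for illustration.

```python
import numpy as np

# MountainCar-v0 observation bounds: [position, velocity]
OBS_LOW = np.array([-1.2, -0.07])
OBS_HIGH = np.array([0.6, 0.07])
NUM_ACTIONS = 3   # push left, no push, push right

# Hypothetical hyperparameters, picked for illustration only
NUM_BINS = 30     # bins per observation dimension
ALPHA = 0.05      # learning rate
GAMMA = 0.98      # discount factor

BIN_WIDTH = (OBS_HIGH - OBS_LOW) / NUM_BINS


def discretize(obs):
    """Map a continuous observation to a tuple of bin indices."""
    idx = ((np.asarray(obs) - OBS_LOW) / BIN_WIDTH).astype(int)
    # Clip so that observations on the upper bound fall in the last bin
    return tuple(np.clip(idx, 0, NUM_BINS - 1))


# Q-table: one value per (position bin, velocity bin, action)
Q = np.zeros((NUM_BINS, NUM_BINS, NUM_ACTIONS))


def q_learning_update(Q, obs, action, reward, next_obs):
    """Apply one Q-learning update to the table Q in place."""
    s = discretize(obs)
    s_next = discretize(next_obs)
    # TD target: immediate reward plus discounted best next-state value
    td_target = reward + GAMMA * np.max(Q[s_next])
    Q[s + (action,)] += ALPHA * (td_target - Q[s + (action,)])


# Example: one transition with the per-step reward of -1
q_learning_update(Q, [-0.5, 0.0], 2, -1.0, [-0.49, 0.01])
```

The sketch deliberately avoids calling the Gym API so that it runs on its own. In the agent, the observation, action, reward, and next observation would come from the environment loop instead of being typed in by hand.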
Let's revisit the reinforcement learning Gym boilerplate code we used in Chapter 4, Exploring the Gym and its Features, as follows:
#!/usr/bin/env python
import gym
env = gym.make("Qbert-v0")
MAX_NUM_EPISODES = 10
MAX_STEPS_PER_EPISODE = 500
for episode in range(MAX_NUM_EPISODES):
    obs = env.reset()
    for step in range(MAX_STEPS_PER_EPISODE):
        env.render()
        action = env.action_space.sample()  # Sample random action. This will be replaced by our agent's action when we start developing the ...