Hands-On Intelligent Agents with OpenAI Gym

Implement intelligent agents using PyTorch to solve classic AI problems, play console games like Atari, and perform tasks such as autonomous driving using the CARLA driving simulator

Key Features

  • Explore the OpenAI Gym toolkit and interface to use over 700 learning tasks
  • Implement agents to solve simple to complex AI problems
  • Study learning environments and discover how to create your own

Book Description

Many real-world problems can be broken down into tasks that require a series of decisions to be made or actions to be taken. Solving such tasks without explicitly programming the machine requires it to be artificially intelligent and capable of learning to adapt. This book is an easy-to-follow guide to implementing learning algorithms for software agents that solve discrete or continuous sequential decision-making and control tasks.

Hands-On Intelligent Agents with OpenAI Gym takes you through the process of building intelligent agent algorithms using deep reinforcement learning, starting with the building blocks for configuring, training, logging, visualizing, testing, and monitoring an agent. You will build intelligent agents from scratch to perform a variety of tasks. In the closing chapters, the book provides an overview of the latest learning environments and learning algorithms, along with pointers to more resources that will help you take your deep reinforcement learning skills to the next level.

What you will learn

  • Explore intelligent agents and learning environments
  • Understand the basics of RL and deep RL
  • Get started with OpenAI Gym and PyTorch for deep reinforcement learning
  • Discover deep Q-learning agents to solve discrete optimal control tasks
  • Create custom learning environments for real-world problems
  • Apply a deep actor-critic agent to drive a car autonomously in CARLA
  • Use the latest learning environments and algorithms to upgrade your intelligent agent development skills

Who this book is for

If you're a student, game or machine learning developer, or AI enthusiast looking to get started with building intelligent agents and algorithms to solve a variety of problems with the OpenAI Gym interface, this book is for you. You will also find this book useful if you want to learn how to build deep reinforcement learning-based agents to solve problems in your domain of interest. Though the book covers all the basic concepts that you need to know, some working knowledge of the Python programming language will help you get the most out of it.

Downloading the example code for this book

You can download the example code files for all Packt books you have purchased from your account at http://www.PacktPub.com. If you purchased this book elsewhere, you can visit http://www.PacktPub.com/support and register to have the files e-mailed directly to you.

Table of Contents

  1. Title Page
  2. Copyright and Credits
    1. Hands-On Intelligent Agents with OpenAI Gym
  3. Dedication
  4. Packt Upsell
    1. Why subscribe?
    2. PacktPub.com
  5. Contributors
    1. About the author
    2. About the reviewer
    3. Packt is searching for authors like you
  6. Preface
    1. Who this book is for
    2. What this book covers
    3. To get the most out of this book
      1. Download the example code files
      2. Download the color images
      3. Conventions used
    4. Get in touch
      1. Reviews
  7. Introduction to Intelligent Agents and Learning Environments
    1. What is an intelligent agent?
    2. Learning environments
    3. What is OpenAI Gym?
    4. Understanding the features of OpenAI Gym
      1. Simple environment interface
      2. Comparability and reproducibility
      3. Ability to monitor progress
    5. What can you do with the OpenAI Gym toolkit?
    6. Creating your first OpenAI Gym environment
      1. Creating and visualizing a new Gym environment
    7. Summary
  8. Reinforcement Learning and Deep Reinforcement Learning
    1. What is reinforcement learning?
    2. Understanding what AI means and what's in it in an intuitive way
      1. Supervised learning
      2. Unsupervised learning
      3. Reinforcement learning
    3. Practical reinforcement learning
      1. Agent
      2. Rewards
      3. Environment
      4. State
      5. Model
      6. Value function
        1. State-value function
        2. Action-value function
      7. Policy
    4. Markov Decision Process
    5. Planning with dynamic programming
    6. Monte Carlo learning and temporal difference learning
    7. SARSA and Q-learning
    8. Deep reinforcement learning
    9. Practical applications of reinforcement and deep reinforcement learning algorithms
    10. Summary
  9. Getting Started with OpenAI Gym and Deep Reinforcement Learning
    1. Code repository, setup, and configuration
      1. Prerequisites
      2. Creating the conda environment
      3. Minimal install – the quick and easy way
      4. Complete install of OpenAI Gym learning environments
        1. Instructions for Ubuntu 
        2. Instructions for macOS
        3. MuJoCo installation
        4. Completing the OpenAI Gym setup
    2. Installing tools and libraries needed for deep reinforcement learning
      1. Installing prerequisite system packages
      2. Installing Compute Unified Device Architecture (CUDA)
      3. Installing PyTorch
    3. Summary
  10. Exploring the Gym and its Features
    1. Exploring the list of environments and nomenclature
      1. Nomenclature
      2. Exploring the Gym environments
    2. Understanding the Gym interface
    3. Spaces in the Gym
    4. Summary
  11. Implementing your First Learning Agent - Solving the Mountain Car problem
    1. Understanding the Mountain Car problem
      1. The Mountain Car problem and environment
    2. Implementing a Q-learning agent from scratch
      1. Revisiting Q-learning
      2. Implementing a Q-learning agent using Python and NumPy
        1. Defining the hyperparameters
        2. Implementing the Q_Learner class's __init__ method
        3. Implementing the Q_Learner class's discretize method
        4. Implementing the Q_Learner's get_action method
        5. Implementing the Q_learner class's learn method
        6. Full Q_Learner class implementation
    3. Training the reinforcement learning agent at the Gym
    4. Testing and recording the performance of the agent
    5. A simple and complete Q-Learner implementation for solving the Mountain Car problem
    6. Summary
  12. Implementing an Intelligent Agent for Optimal Control using Deep Q-Learning
    1. Improving the Q-learning agent
      1. Using neural networks to approximate Q-functions
        1. Implementing a shallow Q-network using PyTorch 
          1. Implementing the Shallow_Q_Learner
          2. Solving the Cart Pole problem using a Shallow Q-Network
      2. Experience replay 
        1. Implementing the experience memory
        2. Implementing the replay experience method for the Q-learner class
      3. Revisiting the epsilon-greedy action policy
        1. Implementing an epsilon decay schedule
    2. Implementing a deep Q-learning agent
      1. Implementing a deep convolutional Q-network in PyTorch
      2. Using the target Q-network to stabilize an agent's learning
      3. Logging and visualizing an agent's learning process
        1. Using TensorBoard for logging and visualizing a PyTorch RL agent's progress
      4. Managing hyperparameters and configuration parameters
        1. Using a JSON file to easily configure parameters
        2. The parameters manager
      5. A complete deep Q-learner to solve complex problems with raw pixel input
    3. The Atari Gym environment
      1. Customizing the Atari Gym environment
        1. Implementing custom Gym environment wrappers
          1. Reward clipping
          2. Preprocessing Atari screen image frames
          3. Normalizing observations
          4. Random no-ops on reset
          5. Fire on reset
          6. Episodic life
          7. Max and skip-frame
        2. Wrapping the Gym environment
    4. Training the deep Q-learner to play Atari games
      1. Putting together a comprehensive deep Q-learner
      2. Hyperparameters
      3. Launching the training process
      4. Testing performance of your deep Q-learner in Atari games
    5. Summary
  13. Creating Custom OpenAI Gym Environments - CARLA Driving Simulator
    1. Understanding the anatomy of Gym environments
      1. Creating a template for custom Gym environment implementations
      2. Registering custom environments with OpenAI Gym
    2. Creating an OpenAI Gym-compatible CARLA driving simulator environment
      1. Configuration and initialization
        1. Configuration
        2. Initialization
      2. Implementing the reset method
        1. Customizing the CARLA simulation using the CarlaSettings object
          1. Adding cameras and sensors to a vehicle in CARLA
      3. Implementing the step function for the CARLA environment
        1. Accessing camera or sensor data
        2. Sending actions to control agents in CARLA
          1. Continuous action space in CARLA
          2. Discrete action space in CARLA
          3. Sending actions to the CARLA simulation server
        3. Determining the end of episodes in the CARLA environment
      4. Testing the CARLA Gym environment
    3. Summary
  14. Implementing an Intelligent - Autonomous Car Driving Agent using Deep Actor-Critic Algorithm
    1. The deep n-step advantage actor-critic algorithm
      1. Policy gradients
        1. The likelihood ratio trick
        2. The policy gradient theorem
      2. Actor-critic algorithm
      3. Advantage actor-critic algorithm
      4. n-step advantage actor-critic algorithm
        1. n-step returns
        2. Implementing the n-step return calculation
      5. Deep n-step advantage actor-critic algorithm
    2. Implementing a deep n-step advantage actor critic agent
      1. Initializing the actor and critic networks
      2. Gathering n-step experiences using the current policy
      3. Calculating the actor's and critic's losses
      4. Updating the actor-critic model
      5. Tools to save/load, log, visualize, and monitor
      6. An extension - asynchronous deep n-step advantage actor-critic 
    3. Training an intelligent and autonomous driving agent
      1. Training and testing the deep n-step advantage actor-critic agent
      2. Training the agent to drive a car in the CARLA driving simulator
    4. Summary
  15. Exploring the Learning Environment Landscape - Roboschool, Gym-Retro, StarCraft-II, DeepMindLab
    1. Gym interface-compatible environments
      1. Roboschool
        1. Quickstart guide to setting up and running Roboschool environments
      2. Gym retro
        1. Quickstart guide to setup and run Gym Retro
    2. Other open source Python-based learning environments
      1. StarCraft II - PySC2
        1. Quick start guide to setup and run StarCraft II PySC2 environment
          1. Downloading the StarCraft II Linux packages
          2. Downloading the SC2 maps
          3. Installing PySC2
          4. Playing StarCraftII yourself or running sample agents
      2. DeepMind lab
        1. DeepMind Lab learning environment interface
          1. reset(episode=-1, seed=None)
          2. step(action, num_steps=1)
          3. observations()
          4. is_running()
          5. observation_spec()
          6. action_spec()
          7. num_steps()
          8. fps()
          9. events()
          10. close()
        2. Quick start guide to setup and run DeepMind Lab
          1. Setting up and installing DeepMind Lab and its dependencies
          2. Playing the game, testing a randomly acting agent, or training your own!
    3. Summary
  16. Exploring the Learning Algorithm Landscape - DDPG (Actor-Critic), PPO (Policy-Gradient), Rainbow (Value-Based)
    1. Deep Deterministic Policy Gradients
      1. Core concepts
    2. Proximal Policy Optimization
      1. Core concept
        1. Off-policy learning
        2. On-policy
    3. Rainbow 
      1. Core concept
        1. DQN
        2. Double Q-Learning
        3. Prioritized experience replay
        4. Dueling networks
        5. Multi-step learning/n-step learning
        6. Distributional RL
        7. Noisy nets
      2. Quick summary of advantages and applications
    4. Summary
  17. Other Books You May Enjoy
    1. Leave a review - let other readers know what you think