Reinforcement Learning in Motion

Video description

We all learn by interacting with the world around us, constantly experimenting and interpreting the results. Reinforcement learning is a machine learning technique that follows this same explore-and-learn approach. Ideally suited to applications like automatic controls, simulations, and other adaptive systems, an RL algorithm takes in data from its environment and improves its behavior based on the positive and negative outcomes of these interactions. This liveVideo course will get you started!

About the Technology

With reinforcement learning, an AI agent learns from its environment, constantly responding to the feedback it gets. The agent optimizes its behavior to avoid negative consequences and enhance positive outcomes. The resulting algorithms continually seek out the actions that lead to the best long-term results!

Importantly, with reinforcement learning you don’t need a mountain of data to get started. You just let your AI agent poke and prod its environment, which makes it much easier to take on novel research projects without well-defined training datasets.
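To make that "poke and prod" idea concrete, here is a minimal sketch (illustrative only, not code from the course) of an epsilon-greedy agent learning which of three slot-machine arms pays best. The arm payout probabilities are made-up numbers; no training dataset is involved, only interaction:

```python
import random

# Made-up payout probabilities for three arms; the agent never sees these.
TRUE_PAYOUT = [0.2, 0.5, 0.8]
EPSILON = 0.1  # fraction of pulls spent exploring at random

random.seed(42)
estimates = [0.0] * len(TRUE_PAYOUT)  # the agent's running value estimates
counts = [0] * len(TRUE_PAYOUT)

for _ in range(5000):
    # Explore occasionally; otherwise exploit the best-looking arm.
    if random.random() < EPSILON:
        arm = random.randrange(len(TRUE_PAYOUT))
    else:
        arm = max(range(len(TRUE_PAYOUT)), key=lambda a: estimates[a])
    reward = 1.0 if random.random() < TRUE_PAYOUT[arm] else 0.0
    counts[arm] += 1
    # Incremental sample-average update: nudge the estimate toward the reward.
    estimates[arm] += (reward - estimates[arm]) / counts[arm]

best = max(range(len(TRUE_PAYOUT)), key=lambda a: estimates[a])
print(best)  # the agent settles on the arm with the highest true payout
```

The agent discovers the best arm purely by pulling levers and tracking outcomes, which is exactly why RL projects can start without a curated dataset.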



About the Video

Reinforcement Learning in Motion introduces you to the exciting world of machine systems that learn from their environments! Developer, data scientist, and expert instructor Phil Tabor guides you from the basics all the way to programming your own constantly-learning AI agents. In this course, he’ll break down key concepts like how RL systems learn, how to sense and process environmental data, and how to build and train AI agents. As you learn, you’ll master the core algorithms and get to grips with tools like OpenAI Gym, NumPy, and Matplotlib.

Reinforcement systems learn by doing, and so will you in this interactive, hands-on course! You’ll build and train a variety of algorithms as you go, each with a specific purpose in mind. The rich and interesting examples include simulations that train a robot to escape a maze, help a mountain car get up a steep hill, and balance a pole on a sliding cart. You’ll even teach your agents how to navigate Windy Gridworld, a standard exercise for finding the optimal path even with special conditions!
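Environments like these all share the same interaction loop. Here is a toy sketch (a hypothetical example, not the course's own environments) of a tiny 4x4 gridworld exposing a Gym-style reset/step interface, driven by a random policy until it stumbles onto the goal:

```python
import random

class TinyGridworld:
    """A 4x4 grid: start at (0, 0), earn +1 for reaching the goal at (3, 3)."""
    SIZE = 4
    ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]  # right, left, down, up

    def reset(self):
        self.pos = (0, 0)
        return self.pos

    def step(self, action):
        dr, dc = self.ACTIONS[action]
        # Clamp moves to the grid so the agent can't walk off the edge.
        r = min(max(self.pos[0] + dr, 0), self.SIZE - 1)
        c = min(max(self.pos[1] + dc, 0), self.SIZE - 1)
        self.pos = (r, c)
        done = self.pos == (self.SIZE - 1, self.SIZE - 1)
        reward = 1.0 if done else 0.0
        return self.pos, reward, done

random.seed(0)
env = TinyGridworld()
obs, done, steps = env.reset(), False, 0
while not done:
    obs, reward, done = env.step(random.randrange(4))  # purely random policy
    steps += 1
print(steps)  # a random walk eventually reaches the goal
```

A learning agent would replace the `random.randrange(4)` call with a policy it improves from the rewards it collects, which is the progression the course walks through.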



What's Inside
  • What is a reinforcement learning agent?
  • An introduction to OpenAI Gym
  • Identifying appropriate algorithms
  • Implementing RL algorithms with NumPy
  • Visualizing performance with Matplotlib


About the Reader
You’ll need to be familiar with Python and machine learning basics. Examples use Python libraries like NumPy and Matplotlib. You'll also need some understanding of linear algebra and calculus; see the equations in the Free Downloads section for examples.

About the Author
Phil Tabor is a lifelong coder with a passion for simplifying and teaching complex topics. A physics PhD and former Intel process engineer, he works as a data scientist, teaches machine learning on YouTube, and contributes to Sensenet, an open source project using deep reinforcement learning to teach robots to identify objects by touch.

Quotes
A neat introduction to dive into Deep Reinforcement Learning.
- Sandeep Chigurupati

After watching the first few sections you'll be able to experiment with some simple algorithms and definitely want to continue learning more.
- Rob Pacheco

Gives a fantastic look into the examples and mathematical background.
- Harald Kuhn

It prepares you to apply reinforcement learning directly to a problem you have in hand!
- Yaser Marey

Table of contents

  1. INTRODUCTION TO REINFORCEMENT LEARNING
    1. Course introduction
    2. Getting Acquainted with Machine Learning
    3. How Reinforcement Learning Fits In
    4. Required software
  2. KEY CONCEPTS
    1. Understanding the agent
    2. Defining the environment
    3. Designing the reward
    4. How the agent learns
    5. Choosing actions
    6. Coding the environment
    7. Finishing the maze-running robot problem
  3. BEATING THE CASINO: THE EXPLORE-EXPLOIT DILEMMA
    1. Introducing the multi-armed bandit problem
    2. Action-value methods
    3. Coding the multi-armed bandit test bed
    4. Moving the goal posts: nonstationary problems
    5. Optimistic initial values and upper confidence bound action selection
    6. Wrapping up the explore-exploit dilemma
  4. SKATING THE FROZEN LAKE: MARKOV DECISION PROCESSES
    1. Introducing Markov decision processes and the frozen lake environment
    2. Even robots have goals
    3. Handling uncertainty with policies and value functions
    4. Achieving mastery: Optimal policies and value functions
    5. Skating off the frozen lake
  5. NAVIGATING GRIDWORLD WITH DYNAMIC PROGRAMMING
    1. Crash-landing on planet Gridworld
    2. Let's make a plan: Policy evaluation in Gridworld
    3. The best laid plans: Policy improvement in the Gridworld
    4. Hastening our escape with policy iteration
    5. Creating a backup plan with value iteration
    6. Wrapping up dynamic programming
  6. NAVIGATING THE WINDY GRIDWORLD WITH MONTE CARLO METHODS
    1. The windy gridworld problem
    2. Monte who?
    3. No substitute for action: Policy evaluation with Monte Carlo methods
    4. Monte Carlo control and exploring starts
    5. Monte Carlo control without exploring starts
    6. Off-policy Monte Carlo methods
    7. Return to the frozen lake and wrapping up Monte Carlo methods
  7. BALANCING THE CART POLE: TEMPORAL DIFFERENCE LEARNING
    1. The cart pole problem
    2. TD(0) prediction
    3. On-policy TD control: SARSA
    4. Off-policy TD control: Q learning
    5. Back to school with double learning
    6. Wrapping up temporal difference learning
  8. CLIMBING THE MOUNTAIN WITH APPROXIMATION METHODS
    1. The continuous mountain car problem
    2. Why approximation methods?
    3. Stochastic gradient descent: The intuition
    4. Stochastic gradient descent: The mathematics
    5. Approximate Monte Carlo predictions
    6. Linear methods and tiling
    7. TD(0) semi-gradient prediction
    8. Episodic semi-gradient control: SARSA
    9. Over the hill: wrapping up approximation methods and the mountain car problem
  9. SUMMARY
    1. Course recap
    2. The frontiers of reinforcement learning
    3. What to do next

Product information

  • Title: Reinforcement Learning in Motion
  • Author(s): Phil Tabor
  • Release date: January 2019
  • Publisher(s): Manning Publications
  • ISBN: 10000MNLV201807