Video Description
A neat introduction to dive into Deep Reinforcement Learning.
Reinforcement Learning in Motion introduces you to the exciting world of machine systems that learn from their environments! Developer, data scientist, and expert instructor Phil Tabor guides you from the basics all the way to programming your own constantly learning AI agents. In this course, he’ll break down key concepts like how RL systems learn, how to sense and process environmental data, and how to build and train AI agents. As you learn, you’ll master the core algorithms and get to grips with tools like OpenAI Gym, NumPy, and Matplotlib.
Reinforcement systems learn by doing, and so will you in this hands-on course! You’ll build and train a variety of algorithms as you go, each with a specific purpose in mind. The rich and interesting examples include simulations that train a robot to escape a maze, help a mountain car get up a steep hill, and balance a pole on a sliding cart. You’ll even teach your agents how to navigate Windy Gridworld, a standard exercise for finding the optimal path even with special conditions!
With reinforcement learning, an AI agent learns from its environment, constantly responding to the feedback it gets. The agent optimizes its behavior to avoid negative consequences and enhance positive outcomes. The resulting algorithms are always looking for the most positive and efficient outcomes!
Importantly, with reinforcement learning you don’t need a mountain of data to get started. You just let your AI agent poke and prod its environment, which makes it much easier to take on novel research projects without well-defined training datasets.
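The feedback loop described above can be sketched in a few lines of Python. This is a minimal illustration, not code from the course: it assumes a toy multi-armed bandit (the topic of the course's explore-exploit section) with made-up payout values, and it uses only the standard library rather than NumPy so it runs anywhere. The function name `run_bandit` and all of its parameters are hypothetical.

```python
import random


def run_bandit(true_means, steps=2000, epsilon=0.1, noise=0.1, seed=0):
    """Epsilon-greedy agent on a toy multi-armed bandit (illustrative only)."""
    rng = random.Random(seed)
    n_arms = len(true_means)
    q = [0.0] * n_arms     # running estimate of each arm's value
    counts = [0] * n_arms  # how many times each arm has been pulled

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: try a random arm to gather new information.
            action = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm that currently looks best.
            action = max(range(n_arms), key=lambda a: q[a])

        # The environment answers with a noisy reward (assumed Gaussian here).
        reward = rng.gauss(true_means[action], noise)

        # Incremental sample-average update of the value estimate.
        counts[action] += 1
        q[action] += (reward - q[action]) / counts[action]

    return q, counts


# Hypothetical payouts: the third arm is the best one.
estimates, pulls = run_bandit([0.1, 0.5, 0.9])
best_arm = max(range(len(estimates)), key=lambda a: estimates[a])
print("estimated values:", [round(v, 2) for v in estimates])
print("pulls per arm:", pulls)
print("best arm:", best_arm)
```

The agent starts with no data at all. It builds its value estimates purely from the rewards it receives, occasionally exploring so that it does not lock onto a poor early guess.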
- What is a reinforcement learning agent?
- An introduction to OpenAI Gym
- Identifying appropriate algorithms
- Implementing RL algorithms using NumPy
- Visualizing performance with Matplotlib
Phil Tabor is a lifelong coder with a passion for simplifying and teaching complex topics. A physics PhD and former Intel process engineer, he works as a data scientist, teaches machine learning on YouTube, and contributes to Sensenet, an open source project using deep reinforcement learning to teach robots to identify objects by touch.
After watching the first few sections, you'll be able to experiment with some simple algorithms and will definitely want to continue learning more.
Gives a fantastic look into the examples and mathematical background.
It prepares you to apply reinforcement learning directly to a problem you have in hand!
Table of Contents
- INTRODUCTION TO REINFORCEMENT LEARNING
- KEY CONCEPTS
BEATING THE CASINO: THE EXPLORE-EXPLOIT DILEMMA
- Introducing the multi-armed bandit problem 00:03:47
- Action-value methods 00:06:43
- Coding the multi-armed bandit test bed 00:06:55
- Moving the goal posts: nonstationary problems 00:07:08
- Optimistic initial values and upper confidence bound action selection 00:11:51
- Wrapping up the explore-exploit dilemma 00:04:51
- SKATING THE FROZEN LAKE: MARKOV DECISION PROCESSES
NAVIGATING GRIDWORLD WITH DYNAMIC PROGRAMMING
- Crash-landing on planet Gridworld 00:09:42
- Let's make a plan: Policy evaluation in Gridworld 00:08:18
- The best laid plans: Policy improvement in the Gridworld 00:03:57
- Hastening our escape with policy iteration 00:04:57
- Creating a backup plan with value iteration 00:06:09
- Wrapping up dynamic programming 00:04:08
NAVIGATING THE WINDY GRIDWORLD WITH MONTE CARLO METHODS
- The windy gridworld problem 00:05:33
- Monte who? 00:07:12
- No substitute for action: Policy evaluation with Monte Carlo methods 00:03:53
- Monte Carlo control and exploring starts 00:07:43
- Monte Carlo control without exploring starts 00:06:15
- Off-policy Monte Carlo methods 00:12:06
- Return to the frozen lake and wrapping up Monte Carlo methods 00:06:17
- BALANCING THE CART POLE: TEMPORAL DIFFERENCE LEARNING
CLIMBING THE MOUNTAIN WITH APPROXIMATION METHODS
- The continuous mountain car problem 00:04:31
- Why approximation methods? 00:05:47
- Stochastic gradient descent: The intuition 00:04:05
- Stochastic gradient descent: The mathematics 00:05:18
- Approximate Monte Carlo predictions 00:08:43
- Linear methods and tiling 00:10:54
- TD(0) semi-gradient prediction 00:07:36
- Episodic semi-gradient control: SARSA 00:08:52
- Over the hill: wrapping up approximation methods and the mountain car problem 00:06:10
- Title: Reinforcement Learning in Motion
- Release date: January 2019
- Publisher(s): Manning Publications
- ISBN: 10000MNLV201807