Reinforcement Learning with Python Explained for Beginners

Video description

Learn reinforcement learning from scratch.

About This Video

  • Gain an understanding of all theoretical concepts related to reinforcement learning
  • Master learning models such as model-free learning, Q-learning, and temporal difference learning
  • Model the uncertainty of the environment, stochastic policies, and value functions

In Detail

Although reinforcement learning was introduced academically decades ago, recent developments in the field have been phenomenal. Domains such as self-driving cars, natural language processing, healthcare, and online recommender systems have already seen how RL-based AI agents can bring tremendous gains.

This course will help you get started with reinforcement learning, first by establishing the motivation for the field and then by covering the essential topics, such as Markov Decision Processes, policy and rewards, model-free learning, and temporal difference learning.

Each topic is accompanied by exercises and complementary analysis to help you gain practical, tangible coding skills.

By the end of this course, you will not only have gained the understanding needed to implement RL in your projects but also have implemented an actual FrozenLake project using the OpenAI Gym toolkit.

Table of contents

  1. Chapter 1 : Introduction to Course and Instructor
    1. Introduction to Course and Instructor
  2. Chapter 2 : Motivation Reinforcement Learning
    1. What is Reinforcement Learning
    2. What is Reinforcement Learning Hiders and Seekers by OpenAI
    3. RL Versus Other ML Frameworks
    4. Why Reinforcement Learning
    5. Examples of Reinforcement Learning
    6. Limitations of Reinforcement Learning
    7. Exercises
  3. Chapter 3 : Terminology of Reinforcement Learning
    1. What is Environment
    2. What is Environment_2
    3. What is Agent
    4. What is State
    5. State Belongs to Environment and not to Agent
    6. What is Action
    7. What is Reward
    8. Goal
    9. Policy
    10. Summary
  4. Chapter 4 : GridWorld Example
    1. Setup 1
    2. Setup 2
    3. Setup 3
    4. Policy Comparison
    5. Deterministic Environment
    6. Stochastic Environment
    7. Stochastic Environment 2
    8. Stochastic Environment 3
    9. Non-Stationary Environment
    10. GridWorld Summary
    11. Activity
  5. Chapter 5 : Markov Decision Process Prerequisites
    1. Probability
    2. Probability 2
    3. Probability 3
    4. Conditional Probability
    5. Conditional Probability Fun Example
    6. Joint Probability
    7. Joint probability 2
    8. Joint probability 3
    9. Expected Value
    10. Conditional Expectation
    11. Modeling Uncertainty of Environment
    12. Modeling Uncertainty of Environment 2
    13. Modeling Uncertainty of Environment 3
    14. Modeling Uncertainty of Environment Stochastic Policy
    15. Modeling Uncertainty of Environment Stochastic Policy 2
    16. Modeling Uncertainty of Environment Value Functions
    17. Running Averages
    18. Running Averages 2
    19. Running Averages as Temporal Difference
    20. Activity
  6. Chapter 6 : Elements of Markov Decision Process
    1. Markov Property
    2. State Space
    3. Action Space
    4. Transition Probabilities
    5. Reward Function
    6. Discount Factor
    7. Summary
    8. Activity
  7. Chapter 7 : More on Reward
    1. MOR Quiz 1
    2. MOR Quiz Solution 1
    3. MOR Quiz 2
    4. MOR Quiz Solution 2
    5. MOR Reward Scaling
    6. MOR Infinite Horizons
    7. MOR Quiz 3
    8. MOR Quiz Solution 3
  8. Chapter 8 : Solving Markov DP
    1. MDP Recap
    2. Value Functions
    3. Optimal Value Function
    4. Optimal Policy
    5. Bellman Equation
    6. Value Iteration
    7. Value Iteration Quiz
    8. Value Iteration Quiz Gamma Missing
    9. Value Iteration Solution
    10. Problems of Value Iteration
    11. Policy Evaluation
    12. Policy Evaluation 2
    13. Policy Evaluation 3
    14. Policy Evaluation Closed Form Solution
    15. Policy Iteration
    16. State Action Values
    17. V and Q Comparisons
  9. Chapter 9 : Value Approximation
    1. What Does it Mean that MDP is Unknown
    2. Why Transition Probabilities are Important
    3. Model-Based Solutions
    4. Model-Free Solutions
    5. Monte-Carlo Learning
    6. Monte-Carlo Learning Example
    7. Monte-Carlo Learning Limitations
  10. Chapter 10 : Temporal Differencing - Q Learning
    1. Running Average
    2. Learning Rate
    3. Learning Equation
    4. TD Algorithm
    5. Exploration Versus Exploitation
    6. Epsilon Greedy Policy
    7. SARSA
    8. Q-Learning
    9. Q-Learning Implementation for MAPROVER Clipped
  11. Chapter 11 : TD Lambda
    1. N-Step Look Ahead
    2. Formulation
    3. Values
    4. TD Q-Learning TD Lambda
    5. TD Q-Learning TD Lambda TD(Lambda) MAPRover Activity
  12. Chapter 12 : Project Frozenlake (OpenAI Gym)
    1. Frozenlake 1
    2. Frozenlake Implementation

Product information

  • Title: Reinforcement Learning with Python Explained for Beginners
  • Author(s): AI Sciences OU
  • Release date: February 2021
  • Publisher(s): Packt Publishing
  • ISBN: 9781801072274