Book description
An example-rich guide for beginners to start their reinforcement and deep reinforcement learning journey with state-of-the-art distinct algorithms
Key Features
 Covers a vast spectrum of basic-to-advanced RL algorithms with mathematical explanations of each algorithm
 Learn how to implement algorithms with code by following examples with line-by-line explanations
 Explore the latest RL methodologies such as DDPG, PPO, and the use of expert demonstrations
Book Description
With significant enhancements in the quality and quantity of algorithms in recent years, this second edition of Hands-On Reinforcement Learning with Python has been revamped into an example-rich guide to learning state-of-the-art reinforcement learning (RL) and deep RL algorithms with TensorFlow 2 and the OpenAI Gym toolkit.
In addition to exploring RL basics and foundational concepts such as the Bellman equation, Markov decision processes, and dynamic programming algorithms, this second edition dives deep into the full spectrum of value-based, policy-based, and actor-critic RL methods. It explores state-of-the-art algorithms such as DQN, TRPO, PPO and ACKTR, DDPG, TD3, and SAC in depth, demystifying the underlying math and demonstrating implementations through simple code examples.
The book has several new chapters dedicated to new RL techniques, including distributional RL, imitation learning, inverse RL, and meta RL. You will learn to leverage Stable Baselines, an improvement of OpenAI's Baselines library, to effortlessly implement popular RL algorithms. The book concludes with an overview of promising approaches such as meta-learning and imagination-augmented agents in research.
By the end, you will become skilled in effectively employing RL and deep RL in your real-world projects.
What you will learn
 Understand core RL concepts including the methodologies, math, and code
 Train an agent to solve Blackjack, FrozenLake, and many other problems using OpenAI Gym
 Train an agent to play Ms. Pac-Man using a Deep Q Network
 Learn policy-based, value-based, and actor-critic methods
 Master the math behind DDPG, TD3, TRPO, PPO, and many others
 Explore new avenues such as distributional RL, meta RL, and inverse RL
 Use Stable Baselines to train an agent to walk and play Atari games
Who this book is for
If you're a machine learning developer with little or no experience with neural networks, are interested in artificial intelligence, and want to learn about reinforcement learning from scratch, this book is for you.
Basic familiarity with linear algebra, calculus, and the Python programming language is required. Some experience with TensorFlow would be a plus.
Table of contents
 Preface

Fundamentals of Reinforcement Learning
 Key elements of RL
 The basic idea of RL
 The RL algorithm
 How RL differs from other ML paradigms
 Markov Decision Processes
 Fundamental concepts of RL
 Applications of RL
 RL glossary
 Summary
 Questions
 Further reading
 A Guide to the Gym Toolkit
 The Bellman Equation and Dynamic Programming

Monte Carlo Methods
 Understanding the Monte Carlo method
 Prediction and control tasks
 Monte Carlo prediction
 Monte Carlo control
 Is the MC method applicable to all tasks?
 Summary
 Questions
 Understanding Temporal Difference Learning
 Case Study – The MAB Problem

Deep Learning Foundations
 Biological and artificial neurons
 ANN and its layers
 Exploring activation functions
 Forward propagation in ANNs
 How does an ANN learn?
 Putting it all together
 Recurrent Neural Networks
 LSTM to the rescue
 What are CNNs?
 The architecture of CNNs
 Generative adversarial networks
 Total loss
 Summary
 Questions
 Further reading
 A Primer on TensorFlow
 Deep Q Network and Its Variants
 Policy Gradient Method
 Actor-Critic Methods – A2C and A3C
 Learning DDPG, TD3, and SAC
 TRPO, PPO, and ACKTR Methods

Distributional Reinforcement Learning
 Why distributional reinforcement learning?
 Categorical DQN
 Quantile Regression DQN
 Distributed Distributional DDPG
 Summary
 Questions
 Further reading
 Imitation Learning and Inverse RL

Deep Reinforcement Learning with Stable Baselines
 Installing Stable Baselines
 Creating our first agent with Stable Baselines
 Vectorized environments
 Integrating custom environments
 Playing Atari games with a DQN and its variants
 Lunar lander using A2C
 Swinging up a pendulum using DDPG
 Training an agent to walk using TRPO
 Training a cheetah bot to run using PPO
 Implementing GAIL
 Summary
 Questions
 Further reading
 Reinforcement Learning Frontiers

Appendix 1 – Reinforcement Learning Algorithms
 Reinforcement learning algorithm
 Value Iteration
 Policy Iteration
 First-Visit MC Prediction
 Every-Visit MC Prediction
 MC Prediction – the Q Function
 MC Control Method
 On-Policy MC Control – Exploring starts
 On-Policy MC Control – Epsilon-Greedy
 Off-Policy MC Control
 TD Prediction
 On-Policy TD Control – SARSA
 Off-Policy TD Control – Q Learning
 Deep Q Learning
 Double DQN
 REINFORCE Policy Gradient
 Policy Gradient with Reward-to-Go
 REINFORCE with Baseline
 Advantage Actor Critic
 Asynchronous Advantage Actor-Critic
 Deep Deterministic Policy Gradient
 Twin Delayed DDPG
 Soft Actor-Critic
 Trust Region Policy Optimization
 PPO-Clipped
 PPO-Penalty
 Categorical DQN
 Distributed Distributional DDPG
 DAgger
 Deep Q learning from demonstrations
 MaxEnt Inverse Reinforcement Learning
 MAML in Reinforcement Learning

Appendix 2 – Assessments
 Chapter 1 – Fundamentals of Reinforcement Learning
 Chapter 2 – A Guide to the Gym Toolkit
 Chapter 3 – The Bellman Equation and Dynamic Programming
 Chapter 4 – Monte Carlo Methods
 Chapter 5 – Understanding Temporal Difference Learning
 Chapter 6 – Case Study – The MAB Problem
 Chapter 7 – Deep Learning Foundations
 Chapter 8 – A Primer on TensorFlow
 Chapter 9 – Deep Q Network and Its Variants
 Chapter 10 – Policy Gradient Method
 Chapter 11 – Actor-Critic Methods – A2C and A3C
 Chapter 12 – Learning DDPG, TD3, and SAC
 Chapter 13 – TRPO, PPO, and ACKTR Methods
 Chapter 14 – Distributional Reinforcement Learning
 Chapter 15 – Imitation Learning and Inverse RL
 Chapter 16 – Deep Reinforcement Learning with Stable Baselines
 Chapter 17 – Reinforcement Learning Frontiers
 Other Books You May Enjoy
 Index
Product information
 Title: Deep Reinforcement Learning with Python - Second Edition
 Author(s):
 Release date: September 2020
 Publisher(s): Packt Publishing
 ISBN: 9781839210686