Book description
Grokking Deep Reinforcement Learning uses engaging exercises to teach you how to build deep learning systems. This book combines annotated Python code with intuitive explanations to explore DRL techniques. You'll see how algorithms function and learn to develop your own DRL agents using evaluative feedback.

Table of contents
- Grokking Deep Reinforcement Learning
- Copyright
- dedication
- contents
- front matter
- 1 Introduction to deep reinforcement learning
  - What is deep reinforcement learning?
    - Deep reinforcement learning is a machine learning approach to artificial intelligence
    - Deep reinforcement learning is concerned with creating computer programs
    - Deep reinforcement learning agents can solve problems that require intelligence
    - Deep reinforcement learning agents improve their behavior through trial-and-error learning
    - Deep reinforcement learning agents learn from sequential feedback
    - Deep reinforcement learning agents learn from evaluative feedback
    - Deep reinforcement learning agents learn from sampled feedback
    - Deep reinforcement learning agents use powerful non-linear function approximation
  - The past, present, and future of deep reinforcement learning
  - The suitability of deep reinforcement learning
  - Setting clear two-way expectations
  - Summary
- 2 Mathematical foundations of reinforcement learning
  - Components of reinforcement learning
  - MDPs: The engine of the environment
    - States: Specific configurations of the environment
    - Actions: A mechanism to influence the environment
    - Transition function: Consequences of agent actions
    - Reward signal: Carrots and sticks
    - Horizon: Time changes what’s optimal
    - Discount: The future is uncertain, value it less
    - Extensions to MDPs
    - Putting it all together
  - Summary
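The MDP components listed for chapter 2 (states, actions, transition function, reward signal, discount) can be pictured as plain data. The sketch below is a hypothetical two-state MDP in the common `P[state][action] -> [(prob, next_state, reward, done), ...]` layout — an illustration of the concepts, not the book's code:

```python
# A minimal, hypothetical MDP. Each transition entry is
# (probability, next_state, reward, done).
GAMMA = 0.99  # discount: the future is uncertain, value it less

P = {
    0: {  # state 0
        "left":  [(1.0, 0, 0.0, False)],
        "right": [(0.8, 1, 1.0, True), (0.2, 0, 0.0, False)],
    },
    1: {  # state 1 (terminal: every action self-loops with no reward)
        "left":  [(1.0, 1, 0.0, True)],
        "right": [(1.0, 1, 0.0, True)],
    },
}

# Expected one-step reward of taking "right" in state 0,
# averaging over the stochastic transition function.
expected_r = sum(p * r for p, _, r, _ in P[0]["right"])
print(expected_r)  # 0.8
```

The transition function makes action consequences stochastic, which is why the agent must reason about expectations rather than single outcomes.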
- 3 Balancing immediate and long-term goals
- 4 Balancing the gathering and use of information
  - The challenge of interpreting evaluative feedback
    - Bandits: Single-state decision problems
    - Regret: The cost of exploration
    - Approaches to solving MAB environments
    - Greedy: Always exploit
    - Random: Always explore
    - Epsilon-greedy: Almost always greedy and sometimes random
    - Decaying epsilon-greedy: First maximize exploration, then exploitation
    - Optimistic initialization: Start off believing it’s a wonderful world
  - Strategic exploration
  - Summary
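The epsilon-greedy strategy listed for chapter 4 can be sketched in a few lines. This is a minimal illustration on a hypothetical three-armed bandit (the payout probabilities are made up), not the book's implementation:

```python
import random

def epsilon_greedy(q_values, epsilon=0.1):
    """With probability epsilon explore (random arm); otherwise exploit the best arm."""
    if random.random() < epsilon:
        return random.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])

# Running-average Q estimates for a 3-armed bandit with hypothetical payout probs.
probs = [0.1, 0.5, 0.8]
q, n = [0.0] * 3, [0] * 3
random.seed(0)
for _ in range(2000):
    a = epsilon_greedy(q, epsilon=0.1)
    r = 1.0 if random.random() < probs[a] else 0.0
    n[a] += 1
    q[a] += (r - q[a]) / n[a]  # incremental mean update
print(max(range(3), key=lambda a: q[a]))  # arm 2 is usually identified as best
```

Setting `epsilon=0` recovers the pure greedy strategy, `epsilon=1` the pure random one; decaying epsilon over time interpolates from exploration toward exploitation.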
- 5 Evaluating agents’ behaviors
- 6 Improving agents’ behaviors
- 7 Achieving goals more effectively and efficiently
- 8 Introduction to value-based deep reinforcement learning
  - The kind of feedback deep reinforcement learning agents use
    - Deep reinforcement learning agents deal with sequential feedback
    - But, if it isn’t sequential, what is it?
    - Deep reinforcement learning agents deal with evaluative feedback
    - But, if it isn’t evaluative, what is it?
    - Deep reinforcement learning agents deal with sampled feedback
    - But, if it isn’t sampled, what is it?
  - Introduction to function approximation for reinforcement learning
  - NFQ: The first attempt at value-based deep reinforcement learning
    - First decision point: Selecting a value function to approximate
    - Second decision point: Selecting a neural network architecture
    - Third decision point: Selecting what to optimize
    - Fourth decision point: Selecting the targets for policy evaluation
    - Fifth decision point: Selecting an exploration strategy
    - Sixth decision point: Selecting a loss function
    - Seventh decision point: Selecting an optimization method
    - Things that could (and do) go wrong
  - Summary
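One of the decision points listed for chapter 8 — selecting the targets for policy evaluation — boils down to forming bootstrapped targets of the form r + γ·max Q(s′, ·). A minimal NumPy sketch of that target computation (the batch values are illustrative, not from the book):

```python
import numpy as np

def q_learning_targets(rewards, next_q_values, dones, gamma=0.99):
    """Targets for policy evaluation: r + gamma * max_a' Q(s', a'), zeroed at terminals."""
    max_next_q = next_q_values.max(axis=1)
    return rewards + gamma * max_next_q * (1.0 - dones)

# Hypothetical batch of 3 transitions in a 2-action problem.
rewards = np.array([1.0, 0.0, -1.0])
next_q  = np.array([[0.5, 1.0], [2.0, 0.0], [0.0, 0.0]])
dones   = np.array([0.0, 0.0, 1.0])  # the last transition is terminal
print(q_learning_targets(rewards, next_q, dones))  # targets: 1.99, 1.98, -1.0
```

A loss (e.g. MSE) between these targets and the online network's predictions is then what the optimizer minimizes; because the targets themselves come from the network, things can (and do) go wrong without further stabilization.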
- 9 More stable value-based methods
- 10 Sample-efficient value-based methods
  - Dueling DDQN: A reinforcement-learning-aware neural network architecture
    - Reinforcement learning isn’t a supervised learning problem
    - Nuances of value-based deep reinforcement learning methods
    - Advantage of using advantages
    - A reinforcement-learning-aware architecture
    - Building a dueling network
    - Reconstructing the action-value function
    - Continuously updating the target network
    - What does the dueling network bring to the table?
  - PER: Prioritizing the replay of meaningful experiences
  - Summary
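The "reconstructing the action-value function" step listed for chapter 10 combines the dueling network's two streams as Q(s, a) = V(s) + A(s, a) − mean(A). A tiny NumPy sketch of that aggregation (the stream outputs are made-up numbers, not the book's code):

```python
import numpy as np

def dueling_q(value, advantages):
    """Reconstruct Q from separate value and advantage streams.
    Subtracting the mean advantage keeps the V/A decomposition identifiable."""
    return value + advantages - advantages.mean(axis=-1, keepdims=True)

v = np.array([[2.0]])             # state-value stream output, batch of 1
a = np.array([[1.0, -1.0, 0.0]])  # advantage stream output, 3 actions
q = dueling_q(v, a)
print(q)  # [[3. 1. 2.]]
```

Sharing one estimate of V(s) across all actions is what makes the architecture reinforcement-learning-aware: every update informs the values of all actions in that state, not just the one taken.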
- 11 Policy-gradient and actor-critic methods
- 12 Advanced actor-critic methods
- 13 Toward artificial general intelligence
- index
Product information
- Title: Grokking Deep Reinforcement Learning
- Author(s):
- Release date: December 2020
- Publisher(s): Manning Publications
- ISBN: 9781617295454