Book description
Grokking Deep Reinforcement Learning uses engaging exercises to teach you how to build deep learning systems. This book combines annotated Python code with intuitive explanations to explore DRL techniques. You'll see how algorithms function and learn to develop your own DRL agents using evaluative feedback.
Table of contents
- Grokking Deep Reinforcement Learning
- Copyright
- dedication
- contents
- front matter
- 1 Introduction to deep reinforcement learning
- What is deep reinforcement learning?
- Deep reinforcement learning is a machine learning approach to artificial intelligence
- Deep reinforcement learning is concerned with creating computer programs
- Deep reinforcement learning agents can solve problems that require intelligence
- Deep reinforcement learning agents improve their behavior through trial-and-error learning
- Deep reinforcement learning agents learn from sequential feedback
- Deep reinforcement learning agents learn from evaluative feedback
- Deep reinforcement learning agents learn from sampled feedback
- Deep reinforcement learning agents use powerful non-linear function approximation
- The past, present, and future of deep reinforcement learning
- The suitability of deep reinforcement learning
- Setting clear two-way expectations
- Summary
- 2 Mathematical foundations of reinforcement learning
- Components of reinforcement learning
- MDPs: The engine of the environment
- States: Specific configurations of the environment
- Actions: A mechanism to influence the environment
- Transition function: Consequences of agent actions
- Reward signal: Carrots and sticks
- Horizon: Time changes what's optimal
- Discount: The future is uncertain, value it less
- Extensions to MDPs
- Putting it all together
- Summary
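The chapter 2 entries above introduce rewards, horizons, and discounting. As a rough illustration of the discounting idea only (this is a sketch, not code from the book; the reward sequence and gamma value are made up), a discounted return can be computed like this:

```python
# Minimal sketch: the discounted return G = sum over t of gamma**t * r_t.
# Rewards and discount factor are illustrative values, not from the book.
rewards = [1.0, 0.0, 0.0, 10.0]  # hypothetical reward sequence
gamma = 0.9                      # discount: value future rewards less

discounted_return = sum(gamma**t * r for t, r in enumerate(rewards))
print(discounted_return)  # 1.0 + 0.9**3 * 10.0 ≈ 8.29
```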
- 3 Balancing immediate and long-term goals
- 4 Balancing the gathering and use of information
- The challenge of interpreting evaluative feedback
- Bandits: Single-state decision problems
- Regret: The cost of exploration
- Approaches to solving MAB environments
- Greedy: Always exploit
- Random: Always explore
- Epsilon-greedy: Almost always greedy and sometimes random
- Decaying epsilon-greedy: First maximize exploration, then exploitation
- Optimistic initialization: Start off believing it's a wonderful world
- Strategic exploration
- Summary
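The chapter 4 entries above list exploration strategies for multi-armed bandits. As a hedged sketch of the epsilon-greedy idea (names and values below are illustrative, not the book's implementation):

```python
import numpy as np

# Epsilon-greedy: exploit the current best estimate most of the time,
# explore a random arm with probability epsilon. Illustrative sketch only.
def epsilon_greedy(q_estimates, epsilon=0.1):
    if np.random.random() < epsilon:
        return np.random.randint(len(q_estimates))  # explore
    return int(np.argmax(q_estimates))              # exploit

# Toy bandit loop with incremental running-average Q estimates.
true_means = np.array([0.1, 0.5, 0.8])  # hidden payout probabilities (made up)
q, counts = np.zeros(3), np.zeros(3)
for _ in range(1000):
    arm = epsilon_greedy(q)
    reward = float(np.random.random() < true_means[arm])
    counts[arm] += 1
    q[arm] += (reward - q[arm]) / counts[arm]
```

A decaying schedule (explore more early, exploit more later) follows the same pattern with epsilon shrinking over time.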
- 5 Evaluating agents' behaviors
- 6 Improving agents' behaviors
- 7 Achieving goals more effectively and efficiently
- 8 Introduction to value-based deep reinforcement learning
- The kind of feedback deep reinforcement learning agents use
- Deep reinforcement learning agents deal with sequential feedback
- But, if it isn't sequential, what is it?
- Deep reinforcement learning agents deal with evaluative feedback
- But, if it isn't evaluative, what is it?
- Deep reinforcement learning agents deal with sampled feedback
- But, if it isn't sampled, what is it?
- Introduction to function approximation for reinforcement learning
- NFQ: The first attempt at value-based deep reinforcement learning
- First decision point: Selecting a value function to approximate
- Second decision point: Selecting a neural network architecture
- Third decision point: Selecting what to optimize
- Fourth decision point: Selecting the targets for policy evaluation
- Fifth decision point: Selecting an exploration strategy
- Sixth decision point: Selecting a loss function
- Seventh decision point: Selecting an optimization method
- Things that could (and do) go wrong
- Summary
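The chapter 8 entries above walk through the decision points behind NFQ, including selecting targets for policy evaluation. As a hedged numpy sketch of the Q-learning-style target that value-based methods regress toward (array names and values are illustrative; the book approximates Q with a neural network):

```python
import numpy as np

# TD target for value-based control: y = r + gamma * max_a' Q(s', a'),
# and y = r when s' is terminal. Illustrative sketch, not the book's code.
def td_targets(rewards, next_q_values, dones, gamma=0.99):
    max_next_q = next_q_values.max(axis=1)  # bootstrap with the greedy next value
    return rewards + gamma * max_next_q * (1.0 - dones)

rewards = np.array([0.0, 1.0])
next_q_values = np.array([[0.2, 0.5],      # Q(s', .) for the first transition
                          [0.0, 0.0]])     # unused: second transition is terminal
dones = np.array([0.0, 1.0])
print(td_targets(rewards, next_q_values, dones))  # [0.495 1.   ]
```

Minimizing a mean squared error between targets like these and the network's predictions corresponds to the loss-function decision point listed above.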
- 9 More stable value-based methods
- 10 Sample-efficient value-based methods
- Dueling DDQN: A reinforcement-learning-aware neural network architecture
- Reinforcement learning isn't a supervised learning problem
- Nuances of value-based deep reinforcement learning methods
- Advantage of using advantages
- A reinforcement-learning-aware architecture
- Building a dueling network
- Reconstructing the action-value function
- Continuously updating the target network
- What does the dueling network bring to the table?
- PER: Prioritizing the replay of meaningful experiences
- Summary
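The chapter 10 entries above describe building a dueling network and reconstructing the action-value function. As a hedged PyTorch sketch of that aggregation step, Q(s, a) = V(s) + A(s, a) - mean_a A(s, a) (layer sizes and names are illustrative, not the book's exact architecture):

```python
import torch
import torch.nn as nn

# Dueling sketch: shared features feed a state-value stream V(s) and an
# advantage stream A(s, a); subtracting the mean advantage keeps them identifiable.
class DuelingQNet(nn.Module):
    def __init__(self, obs_dim, n_actions, hidden=64):
        super().__init__()
        self.features = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU())
        self.value = nn.Linear(hidden, 1)              # V(s)
        self.advantage = nn.Linear(hidden, n_actions)  # A(s, a)

    def forward(self, obs):
        h = self.features(obs)
        v, a = self.value(h), self.advantage(h)
        return v + a - a.mean(dim=1, keepdim=True)     # Q(s, a)

q_net = DuelingQNet(obs_dim=4, n_actions=2)
print(q_net(torch.randn(3, 4)).shape)  # torch.Size([3, 2])
```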
- 11 Policy-gradient and actor-critic methods
- 12 Advanced actor-critic methods
- 13 Toward artificial general intelligence
- index
Product information
- Title: Grokking Deep Reinforcement Learning
- Author(s):
- Release date: December 2020
- Publisher(s): Manning Publications
- ISBN: 9781617295454