Book description
Grokking Deep Reinforcement Learning uses engaging exercises to teach you how to build deep learning systems. This book combines annotated Python code with intuitive explanations to explore DRL techniques. You'll see how algorithms function and learn to develop your own DRL agents using evaluative feedback.
Table of contents
 Grokking Deep Reinforcement Learning
 Copyright
 dedication
 contents
 front matter

1 Introduction to deep reinforcement learning

What is deep reinforcement learning?
 Deep reinforcement learning is a machine learning approach to artificial intelligence
 Deep reinforcement learning is concerned with creating computer programs
 Deep reinforcement learning agents can solve problems that require intelligence
 Deep reinforcement learning agents improve their behavior through trial-and-error learning
 Deep reinforcement learning agents learn from sequential feedback
 Deep reinforcement learning agents learn from evaluative feedback
 Deep reinforcement learning agents learn from sampled feedback
 Deep reinforcement learning agents use powerful nonlinear function approximation
 The past, present, and future of deep reinforcement learning
 The suitability of deep reinforcement learning
 Setting clear two-way expectations
 Summary

2 Mathematical foundations of reinforcement learning
 Components of reinforcement learning

MDPs: The engine of the environment
 States: Specific configurations of the environment
 Actions: A mechanism to influence the environment
 Transition function: Consequences of agent actions
 Reward signal: Carrots and sticks
 Horizon: Time changes what’s optimal
 Discount: The future is uncertain, value it less
 Extensions to MDPs
 Putting it all together
 Summary
 3 Balancing immediate and long-term goals

4 Balancing the gathering and use of information

The challenge of interpreting evaluative feedback
 Bandits: Single-state decision problems
 Regret: The cost of exploration
 Approaches to solving MAB environments
 Greedy: Always exploit
 Random: Always explore
 Epsilon-greedy: Almost always greedy and sometimes random
 Decaying epsilon-greedy: First maximize exploration, then exploitation
 Optimistic initialization: Start off believing it’s a wonderful world
 Strategic exploration
 Summary

 5 Evaluating agents’ behaviors
 6 Improving agents’ behaviors
 7 Achieving goals more effectively and efficiently

8 Introduction to value-based deep reinforcement learning

The kind of feedback deep reinforcement learning agents use
 Deep reinforcement learning agents deal with sequential feedback
 But, if it isn’t sequential, what is it?
 Deep reinforcement learning agents deal with evaluative feedback
 But, if it isn’t evaluative, what is it?
 Deep reinforcement learning agents deal with sampled feedback
 But, if it isn’t sampled, what is it?
 Introduction to function approximation for reinforcement learning

NFQ: The first attempt at value-based deep reinforcement learning
 First decision point: Selecting a value function to approximate
 Second decision point: Selecting a neural network architecture
 Third decision point: Selecting what to optimize
 Fourth decision point: Selecting the targets for policy evaluation
 Fifth decision point: Selecting an exploration strategy
 Sixth decision point: Selecting a loss function
 Seventh decision point: Selecting an optimization method
 Things that could (and do) go wrong
 Summary

 9 More stable value-based methods

10 Sample-efficient value-based methods

Dueling DDQN: A reinforcement-learning-aware neural network architecture
 Reinforcement learning isn’t a supervised learning problem
 Nuances of value-based deep reinforcement learning methods
 Advantage of using advantages
 A reinforcement-learning-aware architecture
 Building a dueling network
 Reconstructing the action-value function
 Continuously updating the target network
 What does the dueling network bring to the table?
 PER: Prioritizing the replay of meaningful experiences
 Summary

 11 Policy-gradient and actor-critic methods
 12 Advanced actor-critic methods
 13 Toward artificial general intelligence
 index
Product information
 Title: Grokking Deep Reinforcement Learning
 Author(s):
 Release date: December 2020
 Publisher(s): Manning Publications
 ISBN: 9781617295454