Index
A
Action value functions
Actor-critic (AC) methods
A2C
advantage
asynchronous advantage
implementing A2C
Advantage actor-critic (A2C)
AlphaGo
branching factor
general approaches
MCTS
neural network training
policies
RL network
SL policy network
standard search tree
Artificial intelligence (AI)
Atari games
Auto-differentiation libraries
Autonomous vehicles (AVs)
B
Background planning
Back propagation
Backup diagrams
Baselines
Bayesian approach
Behavior cloning
Behavior policy
Bellman equation
algorithms
MDP
optimal
transition dynamics
Bias
Bootstrapping
C
CartPole environment
Categorical 51-Atom DQN
Cliff-walking
compute_projection function
compute_reward function
Convolutional neural network (CNN)
Covariance matrix adaptation-evolutionary strategy (CMA-ES)
Cross-entropy method
D
DAgger ...
