The Bellman equation of optimality

To explain the Bellman equation, it's better to go a bit abstract. Don't be afraid, I'll provide concrete examples later to support your intuition! Let's start with a deterministic case, when all our actions have a 100% guaranteed outcome. Imagine that our agent observes state $s_0$ and has N available actions. Every action leads to another state, $s_1 \ldots s_N$, with a respective reward, $r_1 \ldots r_N$. Also assume that we know the values,
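
The deterministic setup described above can be sketched in a few lines of Python. This is a minimal illustration, not the book's own code: it assumes the values of the N successor states are already known, and that the agent picks the action maximising the immediate reward plus the discounted value of the state it leads to (the standard deterministic form of the Bellman optimality equation). The function name `best_action_value`, its parameters, the discount factor `gamma`, and all numbers are hypothetical and chosen only for the example.

```python
from typing import Sequence, Tuple


def best_action_value(
    rewards: Sequence[float],
    next_values: Sequence[float],
    gamma: float = 1.0,
) -> Tuple[float, int]:
    """Return (value of the current state, index of the best action).

    rewards[i]     -- immediate reward for taking action i (assumed known)
    next_values[i] -- value of the state that action i leads to (assumed known)
    gamma          -- discount factor applied to the successor state's value
    """
    if len(rewards) != len(next_values):
        raise ValueError("need exactly one reward and one successor value per action")
    if not rewards:
        raise ValueError("at least one action is required")

    # In the deterministic case each action has a single outcome, so its
    # worth is simply: immediate reward + discounted value of the next state.
    action_values = [r + gamma * v for r, v in zip(rewards, next_values)]

    # Acting optimally means picking the action with the largest such sum.
    best = max(range(len(action_values)), key=action_values.__getitem__)
    return action_values[best], best


# Illustrative numbers only: three actions from the current state.
example_rewards = [1.0, 2.0, 0.5]
example_next_values = [4.0, 1.0, 5.0]
value, action = best_action_value(example_rewards, example_next_values, gamma=0.9)
print(f"best action index: {action}, state value: {value:.2f}")
```

Note that the action with the largest immediate reward (index 1 here) is not the one chosen: the third action wins because the state it leads to is worth more in the long run.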
