June 2018
Intermediate to advanced
546 pages
13h 30m
English
To explain the Bellman equation, it's better to go a bit abstract. Don't be afraid, I'll provide concrete examples later to support your intuition! Let's start with a deterministic case, in which every action has a 100% guaranteed outcome. Imagine that our agent observes state $s_0$ and has N available actions. Every action leads to another state, $s_1, \dots, s_N$, with a respective reward, $r_1, \dots, r_N$. Also assume that we know the values,