September 2018
Intermediate to advanced
296 pages
9h 10m
English
The goal of a reinforcement learning agent is to learn to perform a task well in an environment. Mathematically, this means to maximize the cumulative reward, R, which can be expressed in the following equation:

We are simply calculating a weighted sum of the reward received at each timestep.
is called the discount factor, which is a scalar value between 0 and 1. The idea is that the later a reward comes, the less valuable it becomes. This reflects our perspectives on rewards as well; that we'd rather receive $100 now rather than a ...
Read now
Unlock full access