Intrinsic reward
A reasonable alternative is to develop a reward function that is intrinsic to the agent, meaning that it's determined exclusively by the agent's own beliefs. This method comes close to the way newborns learn: they explore the world purely for the sake of exploring, without any immediate benefit. Nonetheless, the knowledge they acquire may be useful later in life.
The intrinsic reward is a kind of exploration bonus based on an estimate of how novel a state is. The more unfamiliar a state is, the higher the intrinsic reward, so the agent is incentivized to explore new regions of the environment. It may have become clear by now that the intrinsic reward can be used as ...
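To make the idea of a novelty-based bonus concrete, here is a minimal sketch that assumes novelty is estimated by simply counting visits to each state. This count-based scheme is only one common choice and is not necessarily the estimator used in this book. The class name `CountBasedBonus`, the scale `beta`, and the `beta / sqrt(count)` form are all assumptions made for illustration.

```python
import math
from collections import defaultdict


class CountBasedBonus:
    """Hypothetical helper: intrinsic reward that shrinks as a state is revisited."""

    def __init__(self, beta=1.0):
        self.beta = beta                 # assumed scale of the exploration bonus
        self.counts = defaultdict(int)   # visit count per (hashable) state

    def __call__(self, state):
        # Record the visit, then return a bonus that decays with familiarity.
        self.counts[state] += 1
        return self.beta / math.sqrt(self.counts[state])


bonus = CountBasedBonus(beta=0.5)
first = bonus("s0")    # first visit to s0: largest bonus
second = bonus("s0")   # second visit to s0: smaller bonus
novel = bonus("s1")    # a state never seen before gets the full bonus again
```

Counting only works when states are discrete and hashable. In large or continuous state spaces, the visit count would have to be replaced by some learned estimate of novelty.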