October 2018
Beginner
362 pages
9h 32m
English
A policy, simply stated, is a way of acting; your place of employment or education has policies about how, what, and when you can do things. This term is no different when used in the context of reinforcement learning. We use policies to map states to potential actions that a reinforcement learning agent can take. Mathematically speaking, policies in reinforcement learning are represented by the Greek letter π, and they tell an agent what action to take at any given state in an MDP. Let's look at a simple MDP to examine this; imagine that you are up late at night, you are sleepy, but maybe you are stuck into a good movie. Do you stay awake or go to bed? In this scenario, we would have three states:
Read now
Unlock full access