A policy is an algorithm or a set of rules that describe how an agent makes its decisions. An example policy can be the strategy an investor uses to trade stocks, where the investor buys a stock when its price goes down and sells the stock when the price goes up.

More formally, a policy is a function, usually denoted as , that maps a state, , to an action, :

This means that an agent decides its action given its current state. This function ...

Get Python Reinforcement Learning Projects now with O’Reilly online learning.

O’Reilly members experience live online training, plus books, videos, and digital content from 200+ publishers.