11 Reinforcement Learning

Reinforcement learning is an area of machine learning concerned with how agents should take actions in an environment in order to maximize a so-called cumulative reward. The idea goes back to the work of the Russian Nobel laureate Ivan Pavlov on classical conditioning, in which he showed that dogs can be trained by rewarding or punishing them. To formalize this, we consider the dog to be in different states $x_k$ at different stages $k$ in time. The value of the next state $x_{k+1}$ depends on the current state $x_k$ and the action $u_k$ we take during our training. Every state $x_k$ is also associated with a reward $r_k$. It is then desirable to maximize the sum, possibly discounted, of rewards over some number of stages, which could be infinite. In reinforcement learning, the action is ...
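The setup described above (a state that evolves with the chosen action, a reward at each stage, and a possibly discounted sum of rewards) can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the function names `step`, `reward`, `policy`, `rollout`, and `discounted_return`, the discount factor `gamma`, and the toy integer-valued problem are all hypothetical and do not come from the text.

```python
def discounted_return(rewards, gamma=1.0):
    """Return sum_k gamma**k * r_k over the given finite list of rewards.

    gamma is an assumed discount factor; gamma = 1.0 gives the plain sum.
    """
    total = 0.0
    # Accumulate from the last stage backwards: total_k = r_k + gamma * total_{k+1}.
    for r in reversed(rewards):
        total = r + gamma * total
    return total


def rollout(step, reward, policy, x0, num_stages):
    """Simulate x_{k+1} = step(x_k, u_k) with u_k = policy(x_k).

    Collects the reward r_k = reward(x_k) at every stage and returns the list.
    """
    x, rewards = x0, []
    for _ in range(num_stages):
        u = policy(x)              # action u_k chosen in state x_k
        rewards.append(reward(x))  # reward r_k associated with state x_k
        x = step(x, u)             # next state x_{k+1}
    return rewards


# Hypothetical toy problem: move an integer state towards the target value 3.
def toy_step(x, u):
    return x + u


def toy_reward(x):
    return -abs(x - 3)


def toy_policy(x):
    return 1 if x < 3 else 0


toy_rewards = rollout(toy_step, toy_reward, toy_policy, x0=0, num_stages=5)
toy_value = discounted_return(toy_rewards, gamma=0.5)
```

In this toy run the visited states are 0, 1, 2, 3, 3, so `toy_rewards` is `[-3, -2, -1, 0, 0]` and `toy_value` is `-4.25`. The policy here is fixed by hand; choosing it so as to maximize the return is the actual learning problem.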
