Markov processes
At the crux of reinforcement learning is the Markov decision process (MDP). A Markov process is a random sequence of events in which the probabilities of future events depend only on the most recent event, not on the full history that preceded it. MDPs extend the basic Markov chain by adding rewards and decisions to the process. The fundamental problem of reinforcement learning can be modeled as an MDP. Markov models are a general class of models that are used to solve MDPs.
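To make the idea of a Markov chain concrete, here is a minimal sketch in Python. The two weather states, the transition probabilities, and the names `TRANSITIONS`, `next_state`, and `simulate` are all hypothetical, chosen only for illustration. The point to notice is that `next_state` looks at nothing except the current state when it samples what comes next.

```python
import random

# Hypothetical two-state chain; the states and probabilities are
# illustrative assumptions, not values from the text.
TRANSITIONS = {
    "sunny": {"sunny": 0.8, "rainy": 0.2},
    "rainy": {"sunny": 0.4, "rainy": 0.6},
}


def next_state(current, rng):
    """Sample the next state using only the current state."""
    states = list(TRANSITIONS[current])
    weights = [TRANSITIONS[current][s] for s in states]
    return rng.choices(states, weights=weights, k=1)[0]


def simulate(start, steps, seed=0):
    """Return a path of steps + 1 states, beginning at start."""
    rng = random.Random(seed)  # seeded so repeated runs give the same path
    path = [start]
    for _ in range(steps):
        # Only path[-1] is consulted; earlier history is ignored.
        path.append(next_state(path[-1], rng))
    return path


if __name__ == "__main__":
    print(simulate("sunny", 10))
```

A full MDP would add two things to this sketch: a choice of action at each step that influences the transition probabilities, and a reward received for each transition.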
Markov models rely on a very important property, called the Markov property, where the current state in a Markov process completely characterizes and explains the state of the world at that time; everything we need to know about predicting future events is dependent ...