MDP

Now that we have a basic understanding of MRPs, we can move on to MDPs. An MDP is an MRP that also involves decisions. All the states in the environment are Markov, so the next state depends only on the current state and the action taken in it. Formally, an MDP can be represented by the tuple $\langle S, A, P, R, \gamma \rangle$, where $S$ is the state space, $A$ is the action set, $P$ is the state transition probability function, $R$ is the reward function, and $\gamma$ is the discount rate. The state transition probability function $P$ and the reward function $R$ are formally defined as:

$$P_{ss'}^{a} = \mathbb{P}\left[S_{t+1} = s' \mid S_t = s, A_t = a\right]$$

$$R_{s}^{a} = \mathbb{E}\left[R_{t+1} \mid S_t = s, A_t = a\right]$$
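To make the tuple concrete, here is a minimal Python sketch that stores an MDP as plain dictionaries and samples transitions from it. The two states, two actions, and all the numbers are invented for illustration; only the structure $\langle S, A, P, R, \gamma \rangle$ follows the definition above.

import random

# Hypothetical state space S and action set A
S = ["class", "pub"]
A = ["study", "relax"]

# P[s][a][s'] = P[S_{t+1} = s' | S_t = s, A_t = a]
# (made-up numbers; each row sums to 1)
P = {
    "class": {"study": {"class": 0.9, "pub": 0.1},
              "relax": {"class": 0.2, "pub": 0.8}},
    "pub":   {"study": {"class": 0.7, "pub": 0.3},
              "relax": {"class": 0.1, "pub": 0.9}},
}

# R[s][a] = E[R_{t+1} | S_t = s, A_t = a] (made-up rewards)
R = {
    "class": {"study": 1.0, "relax": -1.0},
    "pub":   {"study": 0.5, "relax": -2.0},
}

gamma = 0.9  # discount rate

def step(s, a):
    """Sample s' from P[s][a] and return (next_state, expected_reward)."""
    next_states = list(P[s][a].keys())
    probs = list(P[s][a].values())
    s_next = random.choices(next_states, weights=probs)[0]
    return s_next, R[s][a]

# Roll the MDP forward a few steps under randomly chosen actions
s = "class"
for t in range(3):
    a = random.choice(A)
    s_next, r = step(s, a)
    print(f"t={t}: s={s}, a={a}, r={r}, s'={s_next}")
    s = s_next

Note that, unlike an MRP, both the transition probabilities and the rewards are indexed by the action as well as the state; the decision is what distinguishes the two models.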
We can also formally define ...
