Model-based RL
Having a model of the environment means that the state transitions and the rewards can be predicted for each state-action pair, without any interaction with the real environment. As we already mentioned, the model is known only in limited cases, but when it is known, it can be used in many different ways. The most obvious application is planning future actions. Planning means organizing future moves when the consequences of each action are already known. For example, if you know exactly what moves your opponent will make, you can think ahead and plan all your actions before executing the first one. The downside is that planning is not a trivial process and can be very expensive. ...
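As a rough illustration of planning with a known model, here is a minimal sketch. Everything in it is an assumption made for the example rather than something taken from the text: the toy line-world, the names `model` and `plan`, the discount factor `gamma`, and the choice of exhaustive search as the planning method.

```python
from itertools import product

# Hypothetical toy environment: positions 0..4 on a line, with the goal at 4.
ACTIONS = (-1, +1)  # step left, step right
GOAL = 4


def model(state, action):
    """Known model: predict (next_state, reward) without the real environment."""
    if state == GOAL:
        # The goal is absorbing: nothing more can be earned after reaching it.
        return state, 0.0
    next_state = min(max(state + action, 0), GOAL)  # clamp to the line
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward


def plan(state, horizon, gamma=0.9):
    """Score every action sequence of length `horizon` using only the model."""
    best_return, best_sequence = float("-inf"), None
    for sequence in product(ACTIONS, repeat=horizon):
        s, total = state, 0.0
        for t, action in enumerate(sequence):
            s, reward = model(s, action)
            total += (gamma ** t) * reward  # discounted predicted reward
        if total > best_return:
            best_return, best_sequence = total, sequence
    return best_sequence, best_return


if __name__ == "__main__":
    sequence, predicted_return = plan(state=0, horizon=4)
    print(sequence, predicted_return)
```

The whole plan is computed before the first action is executed, which is the "think ahead" idea described above. The expense is also visible: this search evaluates `len(ACTIONS) ** horizon` sequences, so the work grows exponentially with the planning horizon.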