September 2018
Intermediate to advanced
288 pages
7h 38m
English
So far, we have seen how Monte Carlo methods approach problem solving. Our goal, however, is to use them to manage the interaction with an environment. In the previous sections, we said that Monte Carlo methods do not need a model of the environment to estimate the value function and discover good policies. In other words, Monte Carlo is model-free: no knowledge of the Markov Decision Process (MDP) transitions or rewards is required. We do not need to model the environment beforehand; the necessary information is collected while interacting with it (online learning). Monte Carlo methods learn directly from episodes of experience, where an episode ...
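The model-free idea described above can be illustrated with a minimal sketch of first-visit Monte Carlo prediction, one common variant of Monte Carlo value estimation. The environment here is a hypothetical five-state random walk chosen only for illustration, and the names `sample_episode` and `first_visit_mc` are assumptions, not taken from the book. The estimator sees only sampled episodes; it never reads transition probabilities or a reward model.

```python
import random
from collections import defaultdict


def sample_episode(rng, n_states=5, start=2):
    """Hypothetical random walk over states 0..n_states-1.

    The agent moves left or right with equal probability. Leaving the
    chain on the right yields reward 1, on the left reward 0.
    Returns a list of (state, reward) pairs, one per step.
    """
    state = start
    episode = []
    while True:
        next_state = state + rng.choice((-1, 1))
        if next_state < 0:            # fell off the left end
            episode.append((state, 0.0))
            return episode
        if next_state >= n_states:    # fell off the right end
            episode.append((state, 1.0))
            return episode
        episode.append((state, 0.0))
        state = next_state


def first_visit_mc(num_episodes=5000, gamma=1.0, seed=0):
    """Estimate state values by averaging first-visit returns."""
    rng = random.Random(seed)
    returns_sum = defaultdict(float)
    returns_count = defaultdict(int)

    for _ in range(num_episodes):
        episode = sample_episode(rng)

        # Record the first time step at which each state appears.
        first_visit = {}
        for t, (state, _) in enumerate(episode):
            first_visit.setdefault(state, t)

        # Walk backwards, accumulating the discounted return G_t.
        g = 0.0
        for t in range(len(episode) - 1, -1, -1):
            state, reward = episode[t]
            g = gamma * g + reward
            if first_visit[state] == t:
                returns_sum[state] += g
                returns_count[state] += 1

    return {s: returns_sum[s] / returns_count[s] for s in sorted(returns_sum)}


if __name__ == "__main__":
    for state, value in first_visit_mc().items():
        print(f"V({state}) ~ {value:.3f}")
```

For this particular toy chain the exact values are known to be (s + 1) / 6 for state s, so the printed estimates can be compared against them. The averages are built purely from observed returns, which is what model-free means in this setting.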