January 2020
Intermediate to advanced
432 pages
10h 18m
English
Now that we understand the Monte Carlo method, we need to see how to apply it to RL. Recall that we now assume the environment is largely unknown; that is, we do not have a model of it. Instead, we need an algorithm that explores the environment by trial and error. We can then use Monte Carlo to average the returns from those trials and derive a better policy. That improved policy, in turn, guides further exploration and further improvement. Essentially, our algorithm becomes an explorer rather than a planner, which is why we now refer to it as an agent.
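To make the explore, average, improve loop concrete, here is a minimal sketch of first-visit Monte Carlo control with an epsilon-greedy agent. Everything in it is an assumption made for illustration and is not taken from the text: the five-state corridor environment, the names `step`, `run_episode`, `monte_carlo_control`, and `greedy_policy`, and the values chosen for `GAMMA` and `EPSILON`. The agent never reads the rules inside `step`; it only observes the states and rewards that come back.

```python
import random
from collections import defaultdict

# Hypothetical toy environment: a corridor of states 0..4.
# Both ends are terminal; only reaching the right end pays a reward.
N_STATES = 5
ACTIONS = (0, 1)   # 0 = move left, 1 = move right
GAMMA = 0.9        # discount factor (assumed value)
EPSILON = 0.1      # exploration rate (assumed value)


def step(state, action):
    """Environment dynamics. The agent treats this as a black box."""
    next_state = state + (1 if action == 1 else -1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    done = next_state in (0, N_STATES - 1)
    return next_state, reward, done


def choose_action(q, state, rng):
    """Epsilon-greedy: usually exploit current estimates, sometimes explore."""
    if rng.random() < EPSILON:
        return rng.choice(ACTIONS)
    best = max(q[(state, a)] for a in ACTIONS)
    # Break ties randomly so early episodes are not biased to one side.
    return rng.choice([a for a in ACTIONS if q[(state, a)] == best])


def run_episode(q, rng, max_steps=100):
    """One trial: act in the environment and record (state, action, reward)."""
    state = N_STATES // 2
    trajectory = []
    for _ in range(max_steps):
        action = choose_action(q, state, rng)
        next_state, reward, done = step(state, action)
        trajectory.append((state, action, reward))
        state = next_state
        if done:
            break
    return trajectory


def monte_carlo_control(episodes=2000, seed=0):
    """Average first-visit returns per (state, action) across many trials."""
    rng = random.Random(seed)
    q = defaultdict(float)      # running mean of returns
    counts = defaultdict(int)   # number of returns averaged so far
    for _ in range(episodes):
        trajectory = run_episode(q, rng)
        # Index of the first occurrence of each (state, action) pair.
        first_visit = {}
        for i, (s, a, _) in enumerate(trajectory):
            first_visit.setdefault((s, a), i)
        # Walk backwards so the discounted return accumulates in one pass.
        g = 0.0
        for i in range(len(trajectory) - 1, -1, -1):
            s, a, r = trajectory[i]
            g = r + GAMMA * g
            if first_visit[(s, a)] == i:
                counts[(s, a)] += 1
                q[(s, a)] += (g - q[(s, a)]) / counts[(s, a)]
    return q


def greedy_policy(q):
    """Read the improved policy off the averaged action values."""
    return {s: max(ACTIONS, key=lambda a: q[(s, a)])
            for s in range(1, N_STATES - 1)}


if __name__ == "__main__":
    q_values = monte_carlo_control()
    print(greedy_policy(q_values))
```

Because the agent always acts on its latest estimates, policy improvement is implicit: each new episode is generated by a policy that is greedy with respect to the averages so far, with a small amount of random exploration. In this toy corridor we would expect the learned policy to settle on moving right from every non-terminal state.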