© Nimish Sanghi 2021
N. Sanghi, Deep Reinforcement Learning with Python, https://doi.org/10.1007/978-1-4842-6809-4_4

4. Model-Free Approaches

Nimish Sanghi
Bangalore, India

In the previous chapter, we looked at dynamic programming, where the model dynamics p(s′, r | s, a) were known and this knowledge was used to “plan” the optimal actions. This is also known as the planning problem. In this chapter, we shift our focus to learning problems, i.e., setups where the model dynamics are not known. We will learn state-value and action-value functions by sampling, i.e., by collecting experience from following some policy in the real world or from running the agent through a policy in simulation. There is another class of problems where we find the model-free ...
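To make "learning by sampling" concrete, here is a minimal sketch of first-visit Monte Carlo estimation of a state-value function. The five-state random-walk environment, the uniformly random policy, and every name in the code (step, random_policy, generate_episode, mc_first_visit) are assumptions made for illustration and are not taken from the chapter. The estimator never reads the transition probabilities; it only averages the returns observed in sampled episodes.

```python
import random
from collections import defaultdict

N_STATES = 5  # hypothetical non-terminal states 0..4; stepping off either end terminates


def step(state, action):
    """Hypothetical environment: returns (next_state, reward); next_state is None when terminal."""
    next_state = state + action
    if next_state < 0:
        return None, 0.0   # fell off the left end: no reward
    if next_state >= N_STATES:
        return None, 1.0   # fell off the right end: reward 1
    return next_state, 0.0


def random_policy(state, rng):
    """Assumed policy: move left (-1) or right (+1) with equal probability."""
    return rng.choice([-1, 1])


def generate_episode(policy, rng, start=2):
    """Follow the policy until termination, recording (state, reward) pairs."""
    episode = []
    state = start
    while state is not None:
        action = policy(state, rng)
        next_state, reward = step(state, action)
        episode.append((state, reward))
        state = next_state
    return episode


def mc_first_visit(policy, n_episodes=20000, gamma=1.0, seed=0):
    """Estimate V(s) as the average return following the first visit to s."""
    rng = random.Random(seed)
    returns_sum = defaultdict(float)
    returns_cnt = defaultdict(int)
    for _ in range(n_episodes):
        episode = generate_episode(policy, rng)
        G = 0.0
        first_visit_return = {}
        # Walk backward so G accumulates the discounted return from each step;
        # the last overwrite for a state corresponds to its earliest visit.
        for state, reward in reversed(episode):
            G = reward + gamma * G
            first_visit_return[state] = G
        for state, ret in first_visit_return.items():
            returns_sum[state] += ret
            returns_cnt[state] += 1
    return {s: returns_sum[s] / returns_cnt[s] for s in sorted(returns_sum)}


values = mc_first_visit(random_policy)
```

For this particular walk the true values under the random policy are (s + 1) / 6, so the sampled estimates can be checked against a known answer even though the estimator itself never uses the model.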
