5. Function Approximation

In the previous three chapters, we looked at various approaches to planning and control: first dynamic programming (DP), then Monte Carlo (MC) methods, and finally temporal difference (TD) learning. In all of these approaches, we considered problems where both the state space and the actions were discrete. Only toward the end of the previous chapter did we discuss Q-learning in a continuous state space, where we discretized the state values in an ad hoc way and trained a learning model. In this chapter, we are going to extend that approach ...
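To make the discretization idea concrete before we move past it, here is a minimal sketch of binning a continuous state for tabular Q-learning. The state range, bin count, learning rate, and toy update below are illustrative assumptions, not values from the text:

```python
import numpy as np

# Illustrative assumptions: a 1-D continuous state in [-1, 1],
# 10 bins, 2 actions, and standard Q-learning hyperparameters.
N_BINS = 10
N_ACTIONS = 2
ALPHA, GAMMA = 0.1, 0.99

# N_BINS - 1 edges partition [-1, 1] into N_BINS cells.
bin_edges = np.linspace(-1.0, 1.0, N_BINS - 1)

def discretize(s: float) -> int:
    """Map a continuous state to a discrete bin index in [0, N_BINS - 1]."""
    return int(np.digitize(s, bin_edges))

Q = np.zeros((N_BINS, N_ACTIONS))

def q_update(s: float, a: int, r: float, s_next: float) -> None:
    """One tabular Q-learning step on the discretized states."""
    i, j = discretize(s), discretize(s_next)
    td_target = r + GAMMA * Q[j].max()
    Q[i, a] += ALPHA * (td_target - Q[i, a])
```

The limitation this chapter addresses is visible in the sketch: the table grows with the number of bins per dimension, and nearby states share no information, which is what motivates replacing the table with a function approximator.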