© The Author(s), under exclusive license to APress Media, LLC, part of Springer Nature 2024
N. Sanghi, Deep Reinforcement Learning with Python, https://doi.org/10.1007/979-8-8688-0273-7_10

10. Integrated Planning and Learning

Nimish Sanghi
Bangalore, India

Studying topics separately and then bringing them together has been a recurring theme in this book. We first looked at model-based algorithms in Chapter 3. In that setup, we knew the model dynamics of the world in which the agent operated. The agent used this knowledge, along with the Bellman equations, to first carry out the evaluation (prediction) task and learn the state or state-action values. It then followed this up by improving the policy to get the optimal behavior, which ...
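As a reminder of that model-based recipe, here is a minimal sketch of evaluation followed by policy improvement when the dynamics are known. The three-state chain MDP, its rewards, the discount factor `GAMMA`, and all function names are hypothetical choices made for illustration; they are not taken from the book's own code.

```python
GAMMA = 0.9  # assumed discount factor

# Hypothetical 3-state chain MDP with known dynamics; state 2 is terminal.
# P[s][a] is a list of (probability, next_state, reward) tuples.
# Action 0 moves left (or stays at the left edge), action 1 moves right.
P = {
    0: {0: [(1.0, 0, -1.0)], 1: [(1.0, 1, -1.0)]},
    1: {0: [(1.0, 0, -1.0)], 1: [(1.0, 2, -1.0)]},
    2: {0: [(1.0, 2, 0.0)], 1: [(1.0, 2, 0.0)]},
}


def q_value(V, s, a):
    """One-step Bellman backup for taking action a in state s."""
    return sum(p * (r + GAMMA * V[s2]) for p, s2, r in P[s][a])


def evaluate_policy(policy, tol=1e-10):
    """Prediction: sweep the Bellman expectation backup until values settle."""
    V = {s: 0.0 for s in P}
    while True:
        delta = 0.0
        for s in P:
            v_new = q_value(V, s, policy[s])
            delta = max(delta, abs(v_new - V[s]))
            V[s] = v_new
        if delta < tol:
            return V


def improve_policy(V):
    """Control: act greedily with respect to the current value estimates."""
    return {s: max(P[s], key=lambda a: q_value(V, s, a)) for s in P}


def policy_iteration():
    """Alternate evaluation and improvement until the policy stops changing."""
    policy = {s: 0 for s in P}  # start with "always move left"
    while True:
        V = evaluate_policy(policy)
        new_policy = improve_policy(V)
        if new_policy == policy:
            return policy, V
        policy = new_policy


policy, V = policy_iteration()
print(policy, V)
```

In this toy problem every non-terminal step costs -1, so the greedy policy that emerges moves right toward the terminal state from both non-terminal states.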
