Chapter 5. Tabular Learning and the Bellman Equation

In the previous chapter, we got acquainted with our first Reinforcement Learning (RL) method, the cross-entropy method, and saw its strengths and weaknesses. In this new part of the book, we'll look at another group of methods, called Q-learning, which have much more flexibility and power.

This chapter establishes the background shared by those methods. We'll also revisit the FrozenLake environment and show how the new concepts apply to it and help us address the uncertainty in the environment.

Value, state, and optimality

You may remember our definition of the value of the state in Chapter 1, What is Reinforcement Learning?. This is a very important notion and the time ...

Deep Reinforcement Learning Hands-On by Maxim Lapan
