Chapter 7. Distributional DQN: Getting the full story

This chapter covers

  • Why a full probability distribution is better than a single number
  • Extending ordinary deep Q-networks to output full probability distributions over Q values
  • Implementing a distributional variant of DQN to play Atari Freeway
  • Understanding the ordinary Bellman equation and its distributional variant
  • Prioritizing experience replay to improve training speed

We introduced Q-learning in chapter 3 as a way to determine the value of taking each possible action in a given state; these values are called action values or Q values. This allowed us to apply a policy over the action values, such as choosing the action with the highest value. In this chapter we will extend ...
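As a quick refresher, the sketch below contrasts the two ideas the chapter builds on: a greedy policy over ordinary Q values, and a distributional output where each action gets a probability distribution over returns. The numbers, the 51-atom support, and the uniform placeholder distributions are illustrative assumptions, not code from the book.

```python
import numpy as np

# Hypothetical Q values for three actions in a single state
# (illustrative numbers only).
q_values = np.array([1.2, 3.5, 0.7])

# An ordinary DQN policy picks the action with the highest Q value.
greedy_action = int(np.argmax(q_values))

# A distributional network instead outputs, for each action, a
# probability distribution over a fixed support of possible returns.
support = np.linspace(-10, 10, 51)       # 51 "atoms" of return values (assumed)
probs = np.full((3, 51), 1.0 / 51)       # uniform placeholder distributions
expected_q = probs @ support             # expected values recover ordinary Q values
```

The point of the extension is that `expected_q` collapses each distribution back to a single number, which is all an ordinary DQN ever sees; the distribution itself carries extra information, such as how spread out or multimodal the possible returns are.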
