Distributional RL
The distributional RL method (https://arxiv.org/abs/1707.06887) learns to approximate the full distribution of returns rather than just the expected (average) return. It models these distributions with probability masses placed on a discrete support, which means that rather than estimating a single action-value for each action given a state, it estimates a distribution of action-values. Without going too deep into the details (as that would require a lot of background information), we will look at one of the key contributions of this method to RL in general: the formulation of the Distributional Bellman equation. As you may recall from the previous ...
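The core ideas above can be sketched in a few lines of NumPy. This is a hypothetical illustration (not code from the paper or this book): a categorical return distribution is represented as probability masses on a fixed discrete support of "atoms", the expected action-value is recovered as the mean of that distribution, and the distributional Bellman backup `T z = r + γz` is projected back onto the fixed support. The support bounds, atom count, and function names here are illustrative choices, not values mandated by the method.

```python
import numpy as np

# Illustrative support of atoms (the C51 paper uses 51 atoms on [-10, 10]).
V_MIN, V_MAX, N_ATOMS = -10.0, 10.0, 51
support = np.linspace(V_MIN, V_MAX, N_ATOMS)   # atom locations z_i
delta_z = (V_MAX - V_MIN) / (N_ATOMS - 1)      # spacing between atoms

def expected_value(probs):
    """Collapse a categorical return distribution to its mean, i.e. Q(s, a)."""
    return float(np.dot(probs, support))

def project_distribution(probs, reward, gamma):
    """Apply the distributional Bellman update T z = r + gamma * z and
    project the shifted/shrunk atoms back onto the fixed support."""
    tz = np.clip(reward + gamma * support, V_MIN, V_MAX)
    b = (tz - V_MIN) / delta_z                 # fractional atom positions
    lower = np.floor(b).astype(int)
    upper = np.ceil(b).astype(int)
    projected = np.zeros(N_ATOMS)
    for j in range(N_ATOMS):
        if lower[j] == upper[j]:
            # Updated atom lands exactly on a support atom.
            projected[lower[j]] += probs[j]
        else:
            # Distribute mass to the two neighbouring atoms proportionally.
            projected[lower[j]] += probs[j] * (upper[j] - b[j])
            projected[upper[j]] += probs[j] * (b[j] - lower[j])
    return projected
```

In an actual agent, `probs` would come from a softmax head of a neural network, one distribution per action, and the projected distribution would serve as the target in a cross-entropy loss; the sketch above only shows the support, the mean readout, and the projection step.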