13

Asynchronous Advantage Actor-Critic

This chapter is dedicated to an extension of the advantage actor-critic (A2C) method that we discussed in detail in Chapter 12, The Actor-Critic Method. The extension adds true asynchronous environment interaction, and its full name is asynchronous advantage actor-critic, which is normally abbreviated to A3C. This method is one of the most widely used methods among reinforcement learning (RL) practitioners.

We will take a look at two approaches for adding asynchronous behavior to the basic A2C method: data-level and gradient-level parallelism. They have different resource requirements and characteristics, which makes them applicable to different situations.

In this chapter, we will:

  • Discuss why it is important for ...
