13

Asynchronous Advantage Actor-Critic

This chapter is dedicated to an extension of the advantage actor-critic (A2C) method that we discussed in detail in Chapter 12, The Actor-Critic Method. The extension adds true asynchronous environment interaction, and its full name is asynchronous advantage actor-critic, which is normally abbreviated to A3C. This method is one of the most widely used methods among reinforcement learning (RL) practitioners.

We will take a look at two approaches for adding asynchronous behavior to the basic A2C method: data-level and gradient-level parallelism. They have different resource requirements and characteristics, which makes them applicable to different situations.

In this chapter, we will:

  • Discuss why it is important for ...
