Chapter 13: Asynchronous Advantage Actor-Critic
This chapter is dedicated to an extension of the advantage actor-critic (A2C) method that we discussed in detail in Chapter 12, The Actor-Critic Method. The extension adds truly asynchronous environment interaction; its full name is asynchronous advantage actor-critic, normally abbreviated to A3C. This method is one of the most widely used by reinforcement learning (RL) practitioners.
We will take a look at two approaches for adding asynchronous behavior to the basic A2C method: data-level and gradient-level parallelism. In the first, worker processes gather experience from their own environment copies and send it to a central learner; in the second, workers also compute gradients locally and send those instead. The two approaches differ in resource requirements and communication characteristics, which makes each applicable to different situations.
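To make the data-level variant concrete, here is a minimal sketch (not the book's implementation) of its communication pattern using Python's standard `multiprocessing` module: several worker processes produce experience concurrently and push it to a queue, while a single central process consumes it. The `worker` and `gather` names and the tuple payload are hypothetical placeholders; in a real A3C setup, each worker would run its own environment and push full transitions such as (state, action, reward).

```python
import multiprocessing as mp


def worker(worker_id, out_queue, n_samples):
    # Hypothetical worker: in real data-parallel A3C, each process steps
    # its own environment copy and pushes transitions to the learner.
    for step in range(n_samples):
        transition = (worker_id, step)  # stand-in for (state, action, reward)
        out_queue.put(transition)
    out_queue.put(None)  # sentinel: this worker has finished


def gather(n_workers=4, n_samples=8):
    # Data-level parallelism: one central learner consumes experience
    # produced concurrently by several environment processes.
    queue = mp.Queue()
    procs = [mp.Process(target=worker, args=(i, queue, n_samples))
             for i in range(n_workers)]
    for p in procs:
        p.start()
    batch, finished = [], 0
    while finished < n_workers:
        item = queue.get()
        if item is None:
            finished += 1
        else:
            batch.append(item)
    for p in procs:
        p.join()
    return batch


if __name__ == "__main__":
    samples = gather()
    print(len(samples))  # 4 workers x 8 samples each = 32 transitions
```

In the gradient-level variant, the `worker` function would instead run forward and backward passes locally and put gradients on the queue, so the central process only applies updates; this trades more per-worker compute for less data movement.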
In this chapter, we will:
- Discuss why it is important for ...