Deep Reinforcement Learning Hands-On
by Oleg Vasilev, Maxim Lapan, Martijn van Otterlo, Mikhail Yurushkin, Basem O. F. Alijla
A3C – data parallelism
The first version of A3C parallelization that we'll check (outlined in Figure 2) has one main process that carries out training and several child processes that communicate with environments and gather experience to train on. For simplicity and efficiency, broadcasting the neural network (NN) weights from the trainer process is not implemented. Instead of explicitly gathering and sending weights to the children, the network is shared between all processes using PyTorch's built-in capabilities: calling the share_memory() method when the NN is created lets us use the same nn.Module instance, with all its weights, in different processes. Under the hood, this method has zero overhead for CUDA (as GPU memory is shared ...
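To make the weight-sharing idea concrete, here is a minimal CPU-only sketch. It is not the book's A3C code: the tiny network and the names build_shared_net, child_fn, and run_demo are hypothetical, and zeroing the parameters in place merely stands in for an optimizer step in the trainer process. It assumes only that share_memory() is called before the child process is started.

```python
import torch
import torch.nn as nn
import torch.multiprocessing as mp


def build_shared_net(obs_size=4, n_actions=2):
    """Create a small illustrative network and place its weights in shared memory."""
    net = nn.Sequential(
        nn.Linear(obs_size, 16),
        nn.ReLU(),
        nn.Linear(16, n_actions),
    )
    net.share_memory()  # call before starting any child process
    return net


def child_fn(net, obs_size, updated, results):
    """Hypothetical child: only reads the shared weights, never receives a copy."""
    updated.wait()  # block until the trainer signals that the weights changed
    with torch.no_grad():
        out = net(torch.ones(1, obs_size))
    results.put(out.abs().sum().item())


def run_demo(obs_size=4):
    """Start one child, update the weights in the parent, return what the child saw."""
    ctx = mp.get_context("spawn")
    net = build_shared_net(obs_size)
    updated, results = ctx.Event(), ctx.Queue()
    child = ctx.Process(target=child_fn, args=(net, obs_size, updated, results))
    child.start()
    with torch.no_grad():
        for p in net.parameters():
            p.zero_()  # in-place change, a stand-in for an optimizer step
    updated.set()
    value = results.get()
    child.join()
    return value
```

Because the "spawn" start method re-imports the main module in the child, run_demo() should be called under an `if __name__ == "__main__":` guard. If the weights are truly shared, the child computes its output from the zeroed parameters, so the returned value should be 0.0 even though no weights were sent explicitly.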