The workers' functionality is defined in the worker function, which was previously passed as an argument to mp.Process. Going through all of the code would take too much time and space, so we'll only explain the core components here. As always, the full implementation is available in this book's GitHub repository; if you want to study it in more depth, take the time to examine the code there.
In the first few lines of worker, the computational graph is created to run the policy and optimize it. Specifically, the policy is a multi-layer perceptron with tanh as the activation function, and Adam is used to apply the expected gradient that's computed following the ...
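As a rough stand-in for that setup, the following sketch builds the same kind of two-layer tanh policy and a single Adam update in plain NumPy. All function names and parameter shapes here are assumptions for illustration; the book's version constructs the equivalent operations inside a TensorFlow graph.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_mlp(obs_dim, hidden, act_dim):
    # Illustrative parameter shapes for a two-layer MLP policy
    return {
        "W1": rng.normal(0, 0.1, (obs_dim, hidden)),
        "b1": np.zeros(hidden),
        "W2": rng.normal(0, 0.1, (hidden, act_dim)),
        "b2": np.zeros(act_dim),
    }

def policy(params, obs):
    # Forward pass with tanh nonlinearities, as in the book's policy network
    h = np.tanh(obs @ params["W1"] + params["b1"])
    return np.tanh(h @ params["W2"] + params["b2"])

def adam_step(params, grads, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    # One Adam update applied to every parameter tensor
    for k in params:
        m[k] = b1 * m[k] + (1 - b1) * grads[k]
        v[k] = b2 * v[k] + (1 - b2) * grads[k] ** 2
        m_hat = m[k] / (1 - b1 ** t)          # bias-corrected first moment
        v_hat = v[k] / (1 - b2 ** t)          # bias-corrected second moment
        params[k] -= lr * m_hat / (np.sqrt(v_hat) + eps)
    return params

params = init_mlp(obs_dim=4, hidden=8, act_dim=2)
action = policy(params, np.ones(4))
m = {k: np.zeros_like(p) for k, p in params.items()}
v = {k: np.zeros_like(p) for k, p in params.items()}
grads = {k: np.ones_like(p) for k, p in params.items()}  # placeholder gradient
params = adam_step(params, grads, m, v, t=1)
```

Because the output layer also uses tanh, every action component is squashed into [-1, 1], which is convenient for continuous-control environments with bounded action spaces.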