June 2018
Intermediate to advanced
546 pages
13h 30m
English
In this chapter, we examined three methods that aim to improve the stability of the stochastic policy gradient and compared them to an A2C implementation on two continuous control problems. Together with the methods from the previous chapter (DDPG and D4PG), they form a basic toolkit for working in the continuous control domain.
In the next chapter, we'll switch to a different set of RL methods that have recently become popular: black-box, or gradient-free, methods.