Chapter 7: Policy-Based Methods

Value-based methods that we covered in the previous chapter achieve great results in many environments with discrete control spaces. However, a lot of applications, such as robotics, require continuous control. In this chapter, we go into another important class of algorithms, called policy-based methods, which enable us to solve continuous-control problems. In addition, these methods directly optimize a policy network, and hence stand on a stronger theoretical foundation. Finally, policy-based methods are able to learn truly stochastic policies, which are needed in partially observable environments and games, which value-based methods could not learn. All in all, policy-based approaches complement value-based ...

Get Mastering Reinforcement Learning with Python now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.