4
The Cross-Entropy Method
In the last chapter, you learned about PyTorch. In this chapter, we will wrap up Part 1 of this book, and you will become familiar with one of the reinforcement learning (RL) methods: the cross-entropy method.
Although it is much less famous than other tools in the RL practitioner’s toolbox, such as deep Q-network (DQN) or advantage actor-critic (A2C), the cross-entropy method has its own strengths. Firstly, the method is really simple, which makes it easy to follow. For example, its implementation in PyTorch is less than 100 lines of code.
Secondly, the method has good convergence. In simple environments that don’t require you to learn complex, multistep policies and that have short episodes ...