Chapter 12. Reinforcement learning with actor-critic methods
This chapter covers
- Using advantage to make reinforcement learning more efficient
- Making a self-improving game AI with the actor-critic method
- Designing and training multi-output neural networks in Keras
If you’re learning to play Go, one of the best ways to improve is to get a stronger player to review your games. Sometimes the most useful feedback just points out where you won or lost the game. The reviewer might give comments like, “You were already far behind by move 30” or “At move 110, you had a winning position, but your opponent turned it around by move 130.”
Why is this feedback helpful? You may not have time to scrutinize all 300 moves in a game, but you can focus your ...