Deep Reinforcement Learning Hands-On by Maxim Lapan

Connect4 bot

To see the method in action, let's implement AlphaGo Zero for Connect4. Connect4 is a two-player game played on a board of 6 rows by 7 columns. Each player has disks of one color, and the players take turns dropping a disk into any of the seven columns. A dropped disk falls to the lowest free cell of its column, so disks stack vertically. The objective is to be the first to form a horizontal, vertical, or diagonal line of four disks of the same color. Two game situations are shown in the diagram. In the first, the red player has just won; in the second, the blue player is about to form a group.


Figure 2: Two game positions in Connect4
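
The rules above can be sketched in a few lines of Python. This is only an illustrative sketch, not the book's own game representation: the names `new_board`, `drop`, and `is_win`, the list-of-lists board layout, and the encoding of players as 1 and 2 are all assumptions made for this example.

```python
from typing import List, Optional

ROWS, COLS = 6, 7  # board size from the text: 6 rows by 7 columns
EMPTY = 0          # assumed encoding: 0 = empty cell, 1 and 2 = the two players


def new_board() -> List[List[int]]:
    """Return an empty board; row 0 is the bottom row."""
    return [[EMPTY] * COLS for _ in range(ROWS)]


def drop(board: List[List[int]], col: int, player: int) -> Optional[int]:
    """Drop a disk into `col`. Return the row it lands in, or None if the column is full."""
    for row in range(ROWS):  # scan from the bottom up for the first free cell
        if board[row][col] == EMPTY:
            board[row][col] = player
            return row
    return None


def is_win(board: List[List[int]], row: int, col: int) -> bool:
    """Check whether the disk at (row, col) belongs to a line of four or more."""
    player = board[row][col]
    if player == EMPTY:
        return False
    # Directions: horizontal, vertical, and the two diagonals.
    for d_row, d_col in ((0, 1), (1, 0), (1, 1), (1, -1)):
        count = 1  # the disk at (row, col) itself
        for sign in (1, -1):  # walk both ways along the direction
            r, c = row + sign * d_row, col + sign * d_col
            while 0 <= r < ROWS and 0 <= c < COLS and board[r][c] == player:
                count += 1
                r += sign * d_row
                c += sign * d_col
        if count >= 4:
            return True
    return False


if __name__ == "__main__":
    board = new_board()
    for _ in range(3):
        drop(board, 3, 1)  # player 1 stacks disks in column 3
        drop(board, 0, 2)  # player 2 stacks disks in column 0
    landed = drop(board, 3, 1)  # fourth disk in column 3
    print(is_win(board, landed, 3))
```

Checking only the lines that pass through the most recently dropped disk is enough here, because a new line of four can only appear through the disk that was just placed.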

Despite its simplicity, this game has about 4.5 × 10^12 different game states, ...
