September 2018
Intermediate to advanced
296 pages
9h 10m
English
The first step in training AlphaGo is to train policy networks on records of games played by two professionals (in board games such as chess and Go, it is common to keep records of historical games, including the board state and the move made by each player at every turn). The main idea is for AlphaGo to learn how human experts play Go. More formally, given a board state and a set of actions, we would like a policy network to predict the next move the human makes. The data consists of pairs of sampled ...
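The supervised setup described above, a network that maps a board state to a distribution over moves and is fit to recorded expert moves, can be sketched in a few lines. This is only an illustration under loud assumptions: the 3x3 board, the `toy_expert_move` rule, the `LinearPolicy` class, and the synthetic data are all hypothetical stand-ins, and a linear softmax model replaces the deep network that AlphaGo actually uses. Only the Python standard library is required.

```python
import math
import random

BOARD_POINTS = 9  # hypothetical 3x3 toy board, not a 19x19 Go board


def softmax(logits):
    """Turn raw scores into a probability distribution over moves."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def toy_expert_move(state):
    """Stand-in for a human expert (assumption): play the first empty point."""
    return state.index(0)


def sample_dataset(n, rng):
    """Build synthetic (board state, expert move) pairs; 1 = occupied, 0 = empty."""
    data = []
    while len(data) < n:
        state = [rng.randint(0, 1) for _ in range(BOARD_POINTS)]
        if 0 in state:  # skip full boards, which have no legal move
            data.append((state, toy_expert_move(state)))
    return data


class LinearPolicy:
    """Linear softmax policy, a deliberately tiny substitute for a deep network."""

    def __init__(self):
        self.w = [[0.0] * BOARD_POINTS for _ in range(BOARD_POINTS)]
        self.b = [0.0] * BOARD_POINTS

    def probs(self, state):
        logits = [
            sum(wi * x for wi, x in zip(row, state)) + bi
            for row, bi in zip(self.w, self.b)
        ]
        return softmax(logits)

    def sgd_step(self, state, move, lr):
        """One gradient step on the cross-entropy loss -log p(move | state)."""
        p = self.probs(state)
        for a in range(BOARD_POINTS):
            grad = p[a] - (1.0 if a == move else 0.0)  # d(loss)/d(logit_a)
            self.b[a] -= lr * grad
            for j, x in enumerate(state):
                self.w[a][j] -= lr * grad * x


def mean_loss(policy, data):
    """Average negative log-likelihood of the expert moves."""
    return sum(
        -math.log(max(policy.probs(s)[a], 1e-12)) for s, a in data
    ) / len(data)


rng = random.Random(0)
data = sample_dataset(500, rng)
policy = LinearPolicy()

loss_before = mean_loss(policy, data)
for epoch in range(20):
    for state, move in data:
        policy.sgd_step(state, move, lr=0.1)
loss_after = mean_loss(policy, data)

print(f"mean loss: {loss_before:.3f} -> {loss_after:.3f}")
```

Minimizing the cross-entropy between the predicted move distribution and the recorded move is the same as maximizing the likelihood of the expert's choices, which is why the loss is computed from the probability assigned to the move that was actually played. A real pipeline would swap the toy rule for recorded professional games and the linear model for a convolutional network.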