AlphaGo Zero
Recently, DeepMind published an article about AlphaGo Zero, the latest evolution of AlphaGo. According to the published results, AlphaGo Zero is stronger than every earlier version, which would make it the strongest Go player in history. AlphaGo Zero starts tabula rasa, that is, from a blank slate: it uses only the board states and the games it plays against itself to tune its neural network and predict the right moves.
AlphaGo Zero uses a deep neural network that takes as input the raw board representation (the current position and its history) and outputs both move probabilities and a value. This single network thus combines the roles of the policy network and the value network. The network is trained from games of self-play, unlike previous AlphaGo versions ...
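To make the idea of one network with two outputs concrete, here is a minimal sketch in plain Python. It is not DeepMind's architecture: the class name PolicyValueNet, the single fully connected trunk layer, the 5x5 board, the three input planes, and the random untrained weights are all assumptions chosen to keep the example small. It only illustrates how a shared representation can feed a policy head (a probability for every move) and a value head (a single score between -1 and 1).

```python
import math
import random


class PolicyValueNet:
    """Toy two-headed network (hypothetical): shared trunk, policy head, value head."""

    def __init__(self, board_size=5, planes=3, hidden=16, seed=0):
        rng = random.Random(seed)
        # Input: a stack of board planes (current position plus history), flattened.
        self.n_inputs = planes * board_size * board_size
        # Output moves: every board point plus "pass".
        self.n_moves = board_size * board_size + 1

        def matrix(rows, cols):
            # Small random weights; a real network would learn these from self-play.
            scale = 1.0 / math.sqrt(cols)
            return [[rng.uniform(-scale, scale) for _ in range(cols)]
                    for _ in range(rows)]

        self.w_trunk = matrix(hidden, self.n_inputs)   # shared by both heads
        self.w_policy = matrix(self.n_moves, hidden)   # policy head
        self.w_value = matrix(1, hidden)               # value head

    @staticmethod
    def _matvec(weights, vec):
        return [sum(w * x for w, x in zip(row, vec)) for row in weights]

    def forward(self, planes_flat):
        if len(planes_flat) != self.n_inputs:
            raise ValueError(f"expected {self.n_inputs} inputs, got {len(planes_flat)}")
        # Shared trunk: one linear layer followed by ReLU.
        h = [max(0.0, z) for z in self._matvec(self.w_trunk, planes_flat)]
        # Policy head: softmax over all moves (shifted by the max for stability).
        logits = self._matvec(self.w_policy, h)
        top = max(logits)
        exps = [math.exp(z - top) for z in logits]
        total = sum(exps)
        policy = [e / total for e in exps]
        # Value head: tanh squashes the score into [-1, 1].
        value = math.tanh(self._matvec(self.w_value, h)[0])
        return policy, value


# Usage: evaluate an arbitrary position encoded as 0/1 planes.
net = PolicyValueNet()
rng = random.Random(42)
position = [rng.choice([0.0, 1.0]) for _ in range(net.n_inputs)]
policy, value = net.forward(position)
best_move = max(range(net.n_moves), key=lambda m: policy[m])
```

Because both heads read the same hidden vector, any training signal that improves the trunk benefits move prediction and position evaluation at once, which is the practical appeal of merging the two roles into one network.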