Deep Reinforcement Learning Hands-On

by Oleg Vasilev, Maxim Lapan, Martijn van Otterlo, Mikhail Yurushkin, Basem O. F. Alijla
June 2018
Content level: Intermediate to advanced
546 pages
13h 30m
English
Packt Publishing
Content preview from Deep Reinforcement Learning Hands-On

The AlphaGo Zero method

Overview

At a high level, the method consists of three components, all of which will be explained in detail later, so don't worry if something is not completely clear from this section:

  • We constantly traverse the game tree using the Monte Carlo Tree Search (MCTS) algorithm, the core idea of which is to walk down the game states semi-randomly, expanding them and gathering statistics about the frequency of moves and the underlying game outcomes. As the game tree is huge, in terms of both depth and width, we don't try to build the full tree; we just randomly sample its most promising paths (that's the source of the method's name).
  • At every moment, we have a best player, which is the model used to generate the data via ...
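
To make the tree-search idea in the first bullet more concrete, here is a minimal sketch of MCTS. It is an illustration under stated assumptions, not the book's implementation: it uses the textbook UCB1 selection rule with random rollouts rather than the network-guided variant used by AlphaGo Zero, and the toy game (take 1 to 3 stones from a pile, whoever takes the last stone wins) as well as the names Node, legal_moves, rollout and search are all hypothetical choices made for this example.

```python
import math
import random


def legal_moves(stones):
    """Toy game (an assumption of this sketch): take 1, 2 or 3 stones."""
    return [m for m in (1, 2, 3) if m <= stones]


class Node:
    """Statistics gathered for one game state."""

    def __init__(self, stones, parent=None):
        self.stones = stones      # stones left for the player to move
        self.parent = parent
        self.children = {}        # move -> child Node
        self.visits = 0           # how often the search passed through here
        self.value_sum = 0.0      # outcomes, seen by the player who moved INTO this node

    def untried_moves(self):
        return [m for m in legal_moves(self.stones) if m not in self.children]


def rollout(stones, rng):
    """Random playout. Returns +1 if the player to move at `stones` wins, else -1."""
    sign = 1
    while stones > 0:
        stones -= rng.choice(legal_moves(stones))
        if stones == 0:
            return sign           # whoever just moved took the last stone
        sign = -sign
    return -1                     # empty pile: the previous player already won


def search(root_stones, n_simulations=2000, c=1.4, seed=0):
    """Run MCTS and return the visit count of every move at the root."""
    rng = random.Random(seed)
    root = Node(root_stones)
    for _ in range(n_simulations):
        node = root
        # 1. Selection: walk down fully expanded states using UCB1.
        while node.stones > 0 and not node.untried_moves():
            parent = node
            node = max(
                parent.children.values(),
                key=lambda ch: ch.value_sum / ch.visits
                + c * math.sqrt(math.log(parent.visits) / ch.visits),
            )
        # 2. Expansion: add one new child state, if the game is not over.
        if node.stones > 0:
            move = rng.choice(node.untried_moves())
            child = Node(node.stones - move, parent=node)
            node.children[move] = child
            node = child
        # 3. Simulation: finish the game with random moves.
        outcome = rollout(node.stones, rng)
        # 4. Backpropagation: flip the sign at every level, because the
        #    players alternate on the way back up to the root.
        value = -outcome
        while node is not None:
            node.visits += 1
            node.value_sum += value
            value = -value
            node = node.parent
    return {move: child.visits for move, child in root.children.items()}


if __name__ == "__main__":
    visits = search(10)
    print("visit counts per move:", visits)
    print("most visited move:", max(visits, key=visits.get))
```

The returned visit counts are the "frequency of moves" statistic mentioned above: moves that keep leading to good outcomes are selected more often, so the search spends most of its simulations on the promising paths instead of building the full tree. With a pile of 3 stones, for example, taking all 3 wins immediately, so that move should collect nearly all of the visits.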

ISBN: 9781788834247