Monte Carlo tree search

The second component of our AlphaGo Zero agent is the MCTS algorithm. In our module, we implement an MCTreeSearchNode class, which represents each node in an MCTS tree during a search. This is then used by the agent implemented in to perform MCTS using PolicyValueNetwork, which we implemented just now.

Get Python Reinforcement Learning Projects now with O’Reilly online learning.

O’Reilly members experience live online training, plus books, videos, and digital content from 200+ publishers.