This module contains helper code for converting Go board representations into TensorFlow tensors that can be fed to PolicyValueNetwork. The main function, extract_features, takes board_state, our representation of a Go board, and turns it into a tensor of shape [batch_size, N, N, 17], where N is the side length of the board (9 by default) and 17 is the number of feature channels, which encode the past moves as well as the color to play:

    import numpy as np
    from config import GOPARAMETERS

    def stone_features(board_state):
        # 16 planes, where every other plane represents the stones of a particular color
        # which means we track the stones of the last 8 moves.
        features = np.zeros([16, GOPARAMETERS.N, GOPARAMETERS.N], ...
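Because the listing above is cut off, the following is only a minimal sketch of how 16 stone planes and one color plane could be stacked into an [N, N, 17] array. It is not the book's implementation. The board encoding (1 for black, -1 for white, 0 for empty), the history list, and every sketch_* name are hypothetical and chosen purely for illustration.

```python
import numpy as np

N = 9                 # board side length; 9 is the default named in the text
HISTORY = 8           # number of past positions tracked
BLACK, WHITE = 1, -1  # hypothetical stone encoding; 0 means an empty point


def sketch_stone_planes(history, to_play):
    """Build 16 planes from up to 8 past boards (hypothetical input format).

    history is a list of N x N integer arrays, most recent position first.
    Even planes hold the stones of the player to move, odd planes hold the
    opponent's stones. Missing history is left as all-zero planes.
    """
    planes = np.zeros([2 * HISTORY, N, N], dtype=np.uint8)
    for i, board in enumerate(history[:HISTORY]):
        planes[2 * i] = (board == to_play)
        planes[2 * i + 1] = (board == -to_play)
    return planes


def sketch_color_plane(to_play):
    """One constant plane: all ones if black is to play, all zeros otherwise."""
    value = 1 if to_play == BLACK else 0
    return np.full([1, N, N], value, dtype=np.uint8)


def sketch_extract_features(history, to_play):
    """Stack 16 stone planes and 1 color plane, then move channels last."""
    stacked = np.concatenate(
        [sketch_stone_planes(history, to_play), sketch_color_plane(to_play)],
        axis=0,
    )
    # [17, N, N] -> [N, N, 17]
    return np.transpose(stacked, (1, 2, 0))
```

Stacking several such arrays with np.stack adds the leading batch dimension, giving the [batch_size, N, N, 17] shape described in the text.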
