January 2019
Intermediate to advanced
384 pages
13h 27m
English
This chapter covers
Chapter 9 showed you how to make a Go-playing program play against itself and save the results in experience data. That’s the first half of reinforcement learning; the next step is to use experience data to improve the agent so that it wins more often. The agent from the previous chapter used a neural network to select which move to play. As a thought experiment, imagine you shift every weight in the network by a random amount. Then the agent will select different moves. Just by luck, some of those new moves will be better ...
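The thought experiment above can be tried on a toy scale. The following is a minimal sketch, not the agent from chapter 9: it assumes a hypothetical linear "policy" that scores five candidate moves from eight made-up input features and picks the highest-scoring one. The names `make_weights`, `select_move`, and `perturb` are invented for this illustration. The sketch shifts every weight by a random amount and counts how often the chosen move changes.

```python
import random


def make_weights(num_inputs, num_moves, rng):
    """Random weight matrix: one row of input weights per candidate move."""
    return [[rng.gauss(0.0, 1.0) for _ in range(num_inputs)]
            for _ in range(num_moves)]


def select_move(weights, features):
    """Score each move as a weighted sum of the features; pick the best."""
    scores = [sum(w * x for w, x in zip(row, features)) for row in weights]
    return max(range(len(scores)), key=scores.__getitem__)


def perturb(weights, scale, rng):
    """Return a copy of the weights with Gaussian noise added to each one."""
    return [[w + rng.gauss(0.0, scale) for w in row] for row in weights]


rng = random.Random(0)  # fixed seed so the run is repeatable
weights = make_weights(num_inputs=8, num_moves=5, rng=rng)

# Stand-ins for encoded board positions (hypothetical random features).
positions = [[rng.uniform(-1.0, 1.0) for _ in range(8)] for _ in range(100)]

# Shift every weight by a random amount, as in the thought experiment.
shifted = perturb(weights, scale=0.5, rng=rng)

changed = sum(
    select_move(weights, p) != select_move(shifted, p) for p in positions
)
print(f"{changed} of {len(positions)} positions get a different move")
```

Nothing in this sketch says whether the new moves are better or worse. It only shows that changing the weights changes the agent's behavior, which is the handle that learning from experience data relies on.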