Q-learning using a neural network
Now, we want to test the Q-learning algorithm using a smaller checkerboard environment and a neural network (with Keras). The main difference from the previous examples is that the state is now represented by a screenshot of the current configuration; hence, the model has to learn how to associate a value with each input image and action. This isn't actual deep Q-learning (which is based on Deep Convolutional Networks and requires more complex environments than we can discuss in this book), but it shows how such a model can learn an optimal policy with the same input that would be provided to a human being. In order to reduce the training time, we are considering a square checkerboard environment, with four negative ...