Summary
In this chapter, we looked at the background of RL and at what a DQN is, including the Q-learning algorithm. We have seen how DQNs offer a unique approach to solving problems, relative to the other architectures that we've discussed so far. We are not supplying output labels in the traditional sense as with, say, our CNN from Chapter 5, Next Word Prediction with Recurrent Neural Networks, which processed CIFAR image data. Instead, our output label was the cumulative reward for a given action relative to an environment's state, so in effect we created our output labels dynamically. But instead of them being an end goal for our network, these labels help a virtual agent make intelligent decisions within a discrete space ...
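To make the idea of a dynamically created label concrete, here is a minimal sketch of a single Q-learning step. It is a tabular illustration rather than the network-based DQN built in this chapter, and the function name `q_learning_update`, the learning rate `alpha`, and the discount factor `gamma` are assumptions chosen for this example.

```python
def q_learning_update(q_table, state, action, reward, next_state, done,
                      alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning step and return the target that was used.

    q_table is a list of lists indexed as q_table[state][action].
    alpha (learning rate) and gamma (discount factor) are illustrative defaults.
    """
    # The "label" is not supplied by a dataset: it is computed from the
    # observed reward plus the discounted best estimate for the next state.
    best_next = 0.0 if done else max(q_table[next_state])
    target = reward + gamma * best_next

    # Move the current estimate a fraction alpha of the way toward the target.
    q_table[state][action] += alpha * (target - q_table[state][action])
    return target


# Two states, two actions; state 1 already has some learned values.
q_table = [[0.0, 0.0], [1.0, 2.0]]
target = q_learning_update(q_table, state=0, action=1, reward=1.0,
                           next_state=1, done=False)
```

In a DQN, the table lookup would be replaced by a network's prediction, and a target computed in this way would play the role of the label the network is trained against.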