Summary
In this chapter, we looked at DQN, the Hello World of DRL, and at applying DL to RL. We first looked at why we need DL to tackle environments with more complex, continuous observation spaces, such as CartPole and LunarLander. Then we looked at the more common DL frameworks you may use and the one we use, PyTorch. From there, we installed PyTorch and set up an example that uses computational graphs as a low-level neural network. Following that, we built a second example with the PyTorch neural network interface to see the difference between a raw computational graph and a neural network.
With that knowledge, we then jumped in and explored DQN in detail. We looked at how DQN uses experience replay or a replay buffer ...
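As a reminder of what a replay buffer does, here is a minimal sketch and not the chapter's own implementation. It assumes a fixed-capacity buffer that drops the oldest transitions first and samples uniformly at random; the names ReplayBuffer, Transition, push, and sample are hypothetical and chosen only for illustration.

```python
import random
from collections import deque, namedtuple

# Hypothetical record for one step of agent experience.
Transition = namedtuple(
    "Transition", ["state", "action", "reward", "next_state", "done"]
)


class ReplayBuffer:
    """Fixed-size store of past transitions (illustrative sketch)."""

    def __init__(self, capacity):
        # A deque with maxlen discards the oldest item once it is full.
        self.memory = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        # Record a single transition.
        self.memory.append(Transition(state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling without replacement breaks up the correlation
        # between consecutive steps of the same episode.
        batch = random.sample(self.memory, batch_size)
        # Regroup a list of transitions into one Transition of tuples,
        # e.g. batch.state holds every sampled state.
        return Transition(*zip(*batch))

    def __len__(self):
        return len(self.memory)
```

In a training loop, an agent would typically push one transition per environment step and call sample only once the buffer holds at least batch_size transitions.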