With the preliminary discussion of policy gradients and Actor-Critic Models finished, we can now turn to alternative reinforcement learning algorithms that readers might find useful. Specifically, we will discuss Q learning, Deep Q Learning, and Deep Deterministic Policy Gradients. Once we have covered these, we will be well enough versed to start tackling more abstract, domain-specific problems that teach the reader how to apply reinforcement learning to different tasks.

Q Learning

Q learning ...

