April 2019
Intermediate to advanced
212 pages
5h 34m
English
Like Q-learning, SARSA is a model-free RL method: it learns action values in a Q-table rather than explicitly learning the agent's policy function.
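The primary difference between SARSA and Q-learning is that SARSA is an on-policy method, while Q-learning is an off-policy method. In practice, the two algorithms differ only in the step where the Q-table is updated: SARSA updates toward the value of the action the agent actually takes next under its current policy, whereas Q-learning updates toward the value of the best available next action, regardless of which action the agent goes on to take. Let's discuss what that means with some examples: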
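As a first illustration, the two update steps can be sketched side by side. This is a minimal sketch, not the book's own listing: the function names `sarsa_update` and `q_learning_update`, the 2x2 Q-table, and the values chosen for the learning rate `alpha` and discount factor `gamma` are all assumptions made for the example.

```python
import numpy as np

def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.5, gamma=0.9):
    """On-policy: bootstrap from the action actually taken in s_next."""
    td_target = r + gamma * Q[s_next, a_next]
    Q[s, a] += alpha * (td_target - Q[s, a])
    return Q

def q_learning_update(Q, s, a, r, s_next, alpha=0.5, gamma=0.9):
    """Off-policy: bootstrap from the best action available in s_next."""
    td_target = r + gamma * np.max(Q[s_next])
    Q[s, a] += alpha * (td_target - Q[s, a])
    return Q

# Hypothetical Q-table: 2 states (rows) x 2 actions (columns).
# In state 1, action 1 (value 5.0) looks far better than action 0 (value 1.0).
Q0 = np.array([[0.0, 0.0],
               [1.0, 5.0]])

# One transition: from state 0, take action 0, receive reward 0, land in state 1.
# Suppose the agent then explores and picks the weaker action 0 in state 1.
s, a, r, s_next, a_next = 0, 0, 0.0, 1, 0

q_sarsa = sarsa_update(Q0.copy(), s, a, r, s_next, a_next)
q_learn = q_learning_update(Q0.copy(), s, a, r, s_next)

print("SARSA      Q[0, 0]:", q_sarsa[0, 0])   # uses Q[1, 0] = 1.0
print("Q-learning Q[0, 0]:", q_learn[0, 0])   # uses max(Q[1]) = 5.0
```

With these assumed numbers, SARSA moves `Q[0, 0]` to about 0.45 because it accounts for the exploratory action the agent really took, while Q-learning moves it to about 2.25 because it assumes the greedy action will be taken next. When the agent's next action happens to be the greedy one, the two updates coincide.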

Monte Carlo tree search (MCTS) is a type of model-based RL. We won't be discussing it in detail here, but it's useful to explore further as a contrast to model-free RL algorithms. Briefly, in model-based ...