Skip to Main Content
Hands-On Reinforcement Learning with Python
book

Hands-On Reinforcement Learning with Python

by Sudharsan Ravichandiran
June 2018
Intermediate to advanced content levelIntermediate to advanced
318 pages
9h 24m
English
Packt Publishing
Content preview from Hands-On Reinforcement Learning with Python

SARSA

State-Action-Reward-State-Action (SARSA) is an on-policy TD control algorithm. Like we did in Q learning, here we also focus on state-action value instead of a state-value pair. In SARSA, we update the Q value based on the following update rule:

In the preceding equation, you may notice that there is no max Q(s',a'), like there was in Q learning. Here it is simply Q(s',a'). We can understand this in detail by performing some steps. The steps involved in SARSA are as follows:

  1. First, we initialize the Q values to some arbitrary values
  2. We select an action by the epsilon-greedy policy () and move from one state to another
  3. We update the ...
Become an O’Reilly member and get unlimited access to this title plus top books and audiobooks from O’Reilly and nearly 200 top publishers, thousands of courses curated by job role, 150+ live events each month,
and much more.
Start your free trial

You might also like

Advanced Deep Learning with Python

Advanced Deep Learning with Python

Ivan Vasilev

Publisher Resources

ISBN: 9781788836524Supplemental Content