Skip to Main Content
Hands-On Reinforcement Learning with Python
book

Hands-On Reinforcement Learning with Python

by Sudharsan Ravichandiran
June 2018
Intermediate to advanced content levelIntermediate to advanced
318 pages
9h 24m
English
Packt Publishing
Content preview from Hands-On Reinforcement Learning with Python

Learning from human preference

Learning from human preference is a major breakthrough in RL. The algorithm was proposed by researchers at OpenAI and DeepMind. The idea behind the algorithm is to make the agent learn according to human feedback. Initially, the agents act randomly and then two video clips of the agent performing an action are given to a human. The human can inspect the video clips and tell the agent which video clip is better, that is, in which video the agent is performing the task better and will lead it to achieving the goal. Once this feedback is given, the agent will try to do the actions preferred by the human and set the reward accordingly. Designing reward functions is one of the major challenges in RL, so having human ...

Become an O’Reilly member and get unlimited access to this title plus top books and audiobooks from O’Reilly and nearly 200 top publishers, thousands of courses curated by job role, 150+ live events each month,
and much more.
Start your free trial

You might also like

Advanced Deep Learning with Python

Advanced Deep Learning with Python

Ivan Vasilev

Publisher Resources

ISBN: 9781788836524Supplemental Content