Unsupervised RL
Unsupervised RL is related to standard unsupervised learning in that neither uses any source of supervision. In unsupervised learning the data is not labeled; in the reinforcement learning counterpart, the reward is not given. That is, given an action, the environment returns only the next state: both the reward and the done status are removed.
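To make the reward-free interaction concrete, here is a minimal sketch in Python. It assumes a conventional `step()` that returns a `(next_state, reward, done)` tuple. The names `ToyChainEnv`, `RewardFreeEnv`, and `collect_transitions` are hypothetical, introduced only for illustration, and do not come from any particular library.

```python
import random
from typing import List, Tuple


class ToyChainEnv:
    """Hypothetical 1-D chain environment with a conventional step() signature."""

    def __init__(self, length: int = 5) -> None:
        self.length = length
        self.state = 0

    def reset(self) -> int:
        self.state = 0
        return self.state

    def step(self, action: int) -> Tuple[int, float, bool]:
        # action is -1 (left) or +1 (right); the position is clipped to the chain.
        self.state = max(0, min(self.length - 1, self.state + action))
        done = self.state == self.length - 1
        reward = 1.0 if done else 0.0
        return self.state, reward, done


class RewardFreeEnv:
    """Wrapper that hides reward and done, returning only the next state."""

    def __init__(self, env: ToyChainEnv) -> None:
        self._env = env

    def reset(self) -> int:
        return self._env.reset()

    def step(self, action: int) -> int:
        # The supervision signals are discarded before the agent sees them.
        next_state, _reward, _done = self._env.step(action)
        return next_state


def collect_transitions(
    env: RewardFreeEnv, num_steps: int, seed: int = 0
) -> List[Tuple[int, int, int]]:
    """Gather (state, action, next_state) triples with a random policy."""
    rng = random.Random(seed)
    state = env.reset()
    transitions = []
    for _ in range(num_steps):
        action = rng.choice([-1, 1])
        next_state = env.step(action)
        transitions.append((state, action, next_state))
        state = next_state
    return transitions
```

Calling `collect_transitions(RewardFreeEnv(ToyChainEnv()), 10)` yields ten `(state, action, next_state)` triples. Since no reward or termination signal is available, these triples are all an unsupervised method has to learn from.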
Unsupervised RL can be helpful in many situations, for example, when annotating the environment with hand-designed rewards does not scale, or when one environment can serve multiple tasks. In the latter case, unsupervised learning can be used to learn the dynamics of the environment. Methods that are able to learn from unsupervised sources can also be ...