Summary
In this chapter, we took a break from reinforcement learning algorithms and explored a different type of learning called imitation learning. What sets this paradigm apart is the way in which learning takes place: the resulting policy imitates the behavior of an expert. It differs from reinforcement learning in that there is no reward signal, and in its ability to exploit the rich source of information provided by the expert.
We saw that the dataset from which the learner learns can be expanded with additional state-action pairs to increase the confidence of the learner in new situations. This process is called data aggregation. Moreover, new data could come from the newly learned policy and, ...
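To make the idea of data aggregation concrete, here is a minimal sketch on a toy one-dimensional problem. It is not the chapter's implementation: the expert rule, the nearest-neighbour learner, and the names `expert_action`, `fit_policy`, `rollout`, and `aggregate` are all assumptions made for illustration. The loop trains a policy on the current dataset, lets that policy visit new states, asks the expert to label those states, and appends the resulting state-action pairs to the dataset.

```python
import random


def expert_action(state):
    """Hypothetical expert: step right (+1) when left of the origin, else left (-1)."""
    return 1 if state < 0 else -1


def fit_policy(dataset):
    """Toy learner (an assumption): copy the action of the nearest labelled state."""
    def policy(state):
        _, nearest_action = min(dataset, key=lambda pair: abs(pair[0] - state))
        return nearest_action
    return policy


def rollout(policy, start, steps=10, step_size=0.3):
    """Run a policy from `start` and return the list of states it visits."""
    state, visited = start, []
    for _ in range(steps):
        visited.append(state)
        state += step_size * policy(state)
    return visited


def aggregate(iterations=5, seed=0):
    """Grow the dataset with expert-labelled states visited by the learner."""
    rng = random.Random(seed)
    # Initial demonstrations: states visited by the expert itself.
    dataset = [(s, expert_action(s)) for s in rollout(expert_action, start=2.0)]
    sizes = [len(dataset)]
    for _ in range(iterations):
        policy = fit_policy(dataset)
        # The learner, not the expert, decides which states are visited.
        visited = rollout(policy, start=rng.uniform(-3.0, 3.0))
        # The expert supplies the correct action for each visited state.
        dataset.extend((s, expert_action(s)) for s in visited)
        sizes.append(len(dataset))
    return fit_policy(dataset), dataset, sizes


if __name__ == "__main__":
    policy, dataset, sizes = aggregate()
    print("dataset size after each iteration:", sizes)
```

The design point is that the states added in each iteration come from the learner's own behavior, so the dataset comes to cover situations the initial expert demonstrations never reached.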