January 2020
Intermediate to advanced
432 pages
10h 18m
English
We have already covered a number of ways to address poor training performance, which is often caused by low or sparse rewards. One of these was a technique called behavioural cloning, whereby a human demonstrates a set of actions leading to a reward, and those actions are then fed back into the agent as a pre-trained policy. The winning implementation here combined behavioural cloning with pre-trained image classification.
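As a reminder of the idea behind behavioural cloning, the sketch below treats it as plain supervised learning: demonstrated observations are the inputs and the demonstrated actions are the labels. It is a minimal illustration and not the implementation used in this book. The synthetic "expert", the four-feature observations, and the helper names `make_demonstrations`, `clone_policy`, and `act` are all assumptions made for the example.

```python
import numpy as np

def make_demonstrations(n=500, seed=0):
    """Hypothetical expert: chooses action 1 when the first feature is positive, else 0."""
    rng = np.random.default_rng(seed)
    obs = rng.normal(size=(n, 4))            # made-up 4-feature observations
    actions = (obs[:, 0] > 0).astype(int)    # the demonstrated actions
    return obs, actions

def softmax(logits):
    # Subtract the row maximum for numerical stability.
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def clone_policy(obs, actions, n_actions=2, lr=0.5, epochs=300):
    """Fit a linear softmax policy to the demonstrations by minimising cross-entropy."""
    n, d = obs.shape
    W = np.zeros((d, n_actions))
    b = np.zeros(n_actions)
    onehot = np.eye(n_actions)[actions]
    for _ in range(epochs):
        probs = softmax(obs @ W + b)
        grad = (probs - onehot) / n          # gradient of the mean cross-entropy w.r.t. logits
        W -= lr * obs.T @ grad
        b -= lr * grad.sum(axis=0)
    return W, b

def act(W, b, observation):
    """Greedy action from the cloned policy for a single observation."""
    return int(np.argmax(observation @ W + b))

obs, actions = make_demonstrations()
W, b = clone_policy(obs, actions)
predicted = np.argmax(obs @ W + b, axis=1)
accuracy = (predicted == actions).mean()
print(f"agreement with demonstrations: {accuracy:.2f}")
```

In a real agent, the weights fitted this way would serve only as the starting policy, which reinforcement learning then continues to refine against the environment's rewards.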
We will continue from where we left off in the last exercise and learn the steps needed to pre-train a classifier first: