In the current version of this algorithm, we first need to bootstrap the memories we populate in the agent using a previous run, either by another agent or perhaps a human. This is really no different from imitation learning or behavioral cloning, except that we are using an on-policy agent that we will later use as an off-policy base for our imagination. Before we incorporate imagination into our agent, we can compare the predicted next state against the agent's actual next state. Let's see how this works by opening the next example, Chapter_14_Imagination.py, and going through the following steps:
- This example works by loading the saved-state dictionary we generated in the last exercise, as sketched in the code after this list. Make sure that ...
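
The following is a minimal sketch of what loading a saved state dictionary and comparing a predicted next state against an actual one can look like. The model class, layer sizes, file name, and the placeholder state/action tensors are all assumptions for illustration, not the chapter's actual code; the real `Chapter_14_Imagination.py` defines its own network and loads the dictionary saved in the previous exercise:

```python
import torch
import torch.nn as nn

# Hypothetical forward (dynamics) model: given the current state and an
# action, it predicts the next state. Dimensions are placeholders.
class ImaginationModel(nn.Module):
    def __init__(self, state_dim=4, action_dim=2, hidden_dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, state_dim),
        )

    def forward(self, state, action):
        return self.net(torch.cat([state, action], dim=-1))

model = ImaginationModel()

# To keep this demo self-contained, we save a state dictionary and load it
# back. In the chapter, you would instead load the dictionary produced by
# the previous exercise (the file name here is a placeholder).
torch.save(model.state_dict(), "saved_model.pt")
model.load_state_dict(torch.load("saved_model.pt"))
model.eval()

# Compare the predicted next state against the actual next state. Both
# tensors below are placeholders; the actual next state would come from
# stepping the environment.
state = torch.zeros(1, 4)
action = torch.tensor([[1.0, 0.0]])
predicted_next = model(state, action)
actual_next = torch.zeros(1, 4)  # would come from env.step(...)
error = torch.mean((predicted_next - actual_next) ** 2).item()
print("prediction error (MSE):", error)
```

Printing the mean squared error between the predicted and actual next states gives a quick measure of how well the imagination model has learned the environment's dynamics before we rely on it for planning.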