October 2019
Intermediate to advanced
340 pages
8h 39m
English
We can also plot the total reward for every episode in the training phase:
>>> import matplotlib.pyplot as plt
>>> plt.plot(total_rewards)
>>> plt.xlabel('Episode')
>>> plt.ylabel('Reward')
>>> plt.show()
This will generate the following plot:

If you have not installed matplotlib, you can do so via the following command:
conda install matplotlib
We can see that the reward fluctuates randomly from episode to episode, with no trend of improvement as training proceeds. This is what we expected.
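One way to check for a trend more rigorously than by eye is to smooth the per-episode rewards with a moving average and fit a straight line to them. The following is a minimal sketch under stated assumptions, not code from the book: the `total_rewards` list here is synthetic stand-in data drawn at random, and the helper names `moving_average` and `trend_slope`, as well as the window size of 50, are hypothetical choices made for illustration.

```python
import numpy as np


def moving_average(values, window):
    """Return the simple moving average of `values` over `window` points."""
    values = np.asarray(values, dtype=float)
    kernel = np.ones(window) / window
    # mode='valid' keeps only positions where the window fully overlaps the data
    return np.convolve(values, kernel, mode='valid')


def trend_slope(values):
    """Return the slope of a least-squares line fitted to `values` vs. index."""
    values = np.asarray(values, dtype=float)
    episodes = np.arange(len(values))
    slope, _intercept = np.polyfit(episodes, values, 1)
    return slope


# Synthetic stand-in for total_rewards (assumption): random integer rewards
# between 8 and 200, independent from one episode to the next.
rng = np.random.default_rng(0)
total_rewards = rng.integers(8, 201, size=1000).tolist()

smoothed = moving_average(total_rewards, window=50)
slope = trend_slope(total_rewards)
print('Points after smoothing:', len(smoothed))
print('Fitted slope (reward per episode):', slope)
```

With your own `total_rewards` in place of the synthetic list, a fitted slope close to zero and a roughly flat smoothed curve would be consistent with the absence of improvement described above. The smoothed values can be passed to `plt.plot` in the same way as the raw rewards.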
In the plot of reward versus episodes, we can see that there are some episodes in which the reward reaches 200. We can ...