February 2020
Intermediate to advanced
328 pages
8h 19m
English
In step 1, we created the cliff walking environment using the makeEnvironment() function from the reinforcelearn library. This environment belongs to the gridworld class. In step 2, we wrote a custom function that queries the cliff walking environment to collect sample observational data. The step() method of the env object takes an action as its input argument and returns a list containing the state, the reward, and done. In the last step, once the sequence of observations had been generated, we used the ReinforcementLearning() function to have the agent learn an optimal policy from this data.
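The flow described above (query the environment step by step, record each transition, then learn from the recorded data) can be sketched as follows. This is a minimal illustration under stated assumptions, not the recipe's own code: cliff_step() is a hypothetical base R stand-in for the environment's step() method, and the 4 x 12 grid layout, the state numbering, the rewards, and the helper sample_observations() are all assumptions taken from the textbook cliff walking task. The column names State, Action, Reward, and NextState are likewise illustrative. The final call to ReinforcementLearning() runs only if that package is installed.

```r
# Hypothetical stand-in for the cliff walking environment: a 4 x 12 grid,
# states numbered 0..47 row by row, start at 36, goal at 47, cliff at 37..46.
# Layout and rewards are assumptions, not values read from reinforcelearn.
n_rows <- 4L
n_cols <- 12L
start_state <- 36L
goal_state <- 47L
cliff_states <- 37:46
actions <- c("left", "right", "up", "down")

# Mimics the shape of the step() result described in the text:
# a list with state, reward, and done.
cliff_step <- function(state, action) {
  row <- state %/% n_cols
  col <- state %% n_cols
  if (action == "left")  col <- max(col - 1L, 0L)
  if (action == "right") col <- min(col + 1L, n_cols - 1L)
  if (action == "up")    row <- max(row - 1L, 0L)
  if (action == "down")  row <- min(row + 1L, n_rows - 1L)
  next_state <- row * n_cols + col
  if (next_state %in% cliff_states) {
    # Stepping into the cliff: large penalty and a return to the start.
    return(list(state = start_state, reward = -100, done = FALSE))
  }
  list(state = next_state, reward = -1, done = next_state == goal_state)
}

# Collect n random transitions as (State, Action, Reward, NextState) rows.
sample_observations <- function(n, seed = 1) {
  set.seed(seed)
  states <- character(n)
  acts <- character(n)
  rewards <- numeric(n)
  next_states <- character(n)
  state <- start_state
  for (i in seq_len(n)) {
    action <- sample(actions, 1)
    res <- cliff_step(state, action)
    states[i] <- as.character(state)
    acts[i] <- action
    rewards[i] <- res$reward
    next_states[i] <- as.character(res$state)
    # Start a new episode once the goal has been reached.
    state <- if (res$done) start_state else res$state
  }
  data.frame(State = states, Action = acts, Reward = rewards,
             NextState = next_states, stringsAsFactors = FALSE)
}

observations <- sample_observations(1000)

# Learn from the recorded transitions if the package is available.
# The control values are illustrative, not tuned.
if (requireNamespace("ReinforcementLearning", quietly = TRUE)) {
  model <- ReinforcementLearning::ReinforcementLearning(
    observations,
    s = "State", a = "Action", r = "Reward", s_new = "NextState",
    control = list(alpha = 0.1, gamma = 0.9, epsilon = 0.1)
  )
  print(model)
}
```

States and actions are stored as character strings because ReinforcementLearning() expects them in that form. With purely random actions, the quality of the learned policy depends on how much of the grid the sampled transitions happen to cover.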