November 2019
Intermediate to advanced
296 pages
7h 52m
English
An environment with four states can be designed as follows. The environment holds a list of states, the actions available in each state, and the reward for each state-action pair. The reward is deterministic: it depends only on the current state and the action the agent takes:
class Environment {
  private states = [0, 1, 2, 3];
  private actions = [
    [2, 1],
    [0, 3],
    [3, 0],
    [3, 3], // End state
  ];
  // Reward is decided based on the current state and action an agent takes.
  private rewards = [
    [0, 1],
    [-1, 1],
    [50, -100],
    [0, 0],
  ];
  // Other methods...
}
In this case, the environment also has several utility methods that expose its internal state:
private currentState: number;

constructor() {
  this.currentState = 0;
}

getCurrentState(): ...
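Because the excerpt is cut off, the following is only a minimal sketch of how such an environment might be completed, not the book's implementation. It assumes that `actions[s][a]` is the state reached by taking action `a` in state `s` and that `getCurrentState` returns a `number`. The class name `SketchEnvironment` and the methods `isEndState` and `step` are hypothetical names introduced for illustration.

```typescript
// Hypothetical sketch. Only states, actions, rewards, currentState and
// getCurrentState appear in the excerpt; everything else is an assumption.
class SketchEnvironment {
  private states = [0, 1, 2, 3];

  // Assumption: actions[s][a] is the next state after taking action a in state s.
  private actions = [
    [2, 1],
    [0, 3],
    [3, 0],
    [3, 3], // End state
  ];

  // rewards[s][a] is the reward for taking action a in state s.
  private rewards = [
    [0, 1],
    [-1, 1],
    [50, -100],
    [0, 0],
  ];

  private currentState: number;

  constructor() {
    this.currentState = 0;
  }

  // Assumption: the return type is number (the excerpt truncates the signature).
  getCurrentState(): number {
    return this.currentState;
  }

  // Hypothetical helper: treats the last state in the list as the end state.
  isEndState(): boolean {
    return this.currentState === this.states.length - 1;
  }

  // Hypothetical helper: looks up the reward, moves to the next state,
  // and returns the reward.
  step(action: number): number {
    const reward = this.rewards[this.currentState][action];
    this.currentState = this.actions[this.currentState][action];
    return reward;
  }
}

const env = new SketchEnvironment();
const r1 = env.step(0); // state 0 -> state 2
const r2 = env.step(0); // state 2 -> state 3
console.log(r1, r2, env.getCurrentState(), env.isEndState()); // prints: 0 50 3 true
```

Under these assumptions both lookups are plain array indexing, so the same state and action always yield the same reward and the same next state.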