October 2019
Intermediate to advanced
366 pages
12h 4m
English
In real applications, the choice of environment is dictated by the task to be learned. In research applications, the choice is usually dictated by intrinsic features of the environment. In this latter case, the end goal is not to train the agent on a specific task, but to demonstrate some task-related capabilities.
For instance, if the goal is to create a multi-agent RL algorithm, the environment should have at least two agents with a means of communicating with one another, regardless of the end task. By contrast, to create a lifelong learner (an agent that continuously creates and learns more difficult tasks using the knowledge acquired in previous, easier tasks), the primary quality that the environment should have ...
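As a rough illustration of the multi-agent requirement mentioned above (at least two agents and a means to communicate), here is a minimal sketch of a toy environment. The class name `TwoAgentCommEnv`, the agent names, the integer messages, and the shared reward rule are all hypothetical and chosen only for illustration; they are not taken from the book or from any particular RL library.

```python
class TwoAgentCommEnv:
    """Toy environment (hypothetical): two agents exchange integer messages.

    Each agent's observation is the last message sent by the other agent,
    which gives the pair a simple communication channel.
    """

    def __init__(self, max_steps=5):
        self.agents = ("agent_0", "agent_1")
        self.max_steps = max_steps
        self.t = 0
        self.inbox = {name: 0 for name in self.agents}

    def reset(self):
        # Start a new episode with empty (zero) messages for both agents.
        self.t = 0
        self.inbox = {name: 0 for name in self.agents}
        return dict(self.inbox)

    def step(self, messages):
        # `messages` maps each agent name to the integer it wants to send.
        a, b = self.agents
        # Deliver each agent's message to the other agent.
        self.inbox = {a: messages[b], b: messages[a]}
        self.t += 1
        # Assumed shared reward: 1.0 when both agents send the same symbol.
        reward = 1.0 if messages[a] == messages[b] else 0.0
        done = self.t >= self.max_steps
        return dict(self.inbox), reward, done
```

A short usage example, under the same assumptions:

```python
env = TwoAgentCommEnv(max_steps=2)
obs = env.reset()
obs, reward, done = env.step({"agent_0": 3, "agent_1": 7})
# agent_0 now observes 7 and agent_1 observes 3; the messages differ, so reward is 0.0
```

The point of the sketch is that the end task (here, agreeing on a symbol) is incidental; what matters for multi-agent research is that the environment exposes more than one agent and a channel between them.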