Data preparation

Careful readers may notice that a suffix, v0, follows each game name, and come up with the following questions: What is the meaning of v0? Is it allowable to replace it with v1 or v2? Actually, this suffix has a relationship with the data preprocessing step for the screen images (observations) extracted from the Atari environment.

There are three modes for each game, for example, Breakout, BreakoutDeterministic, and BreakoutNoFrameskip, and each mode has two versions, for example, Breakout-v0 and Breakout-v4. The main difference between the three modes is the value of the frameskip parameter in the Atari environment. This parameter indicates the number of frames (steps) the one action is repeated on. This is called the

Get Python Reinforcement Learning Projects now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.