Roboschool
Up to this point, we have worked with discrete control tasks, such as the Atari games in Chapter 5, Deep Q-Network, and LunarLander in Chapter 6, Learning Stochastic and PG Optimization. In these games, the agent chooses from only a small set of discrete actions, roughly two to five. As we learned in Chapter 6, Learning Stochastic and PG Optimization, policy gradient algorithms can be easily adapted to continuous actions. To demonstrate this, we'll deploy the next few policy gradient algorithms in a new set of environments called Roboschool, in which the goal is to control a robot in different situations. Roboschool was developed by OpenAI and uses the familiar OpenAI Gym interface that we used in the ...