Practical Simulations for Machine Learning

by Paris Buttfield-Addison, Mars Buttfield-Addison, Tim Nugent, Jon Manning
June 2022
Beginner to intermediate
331 pages
7h 15m
English
O'Reilly Media, Inc.

Chapter 12. Under the Hood and Beyond

In this chapter, we’re going to touch on some of the approaches we have used throughout the previous chapters on simulation.

We’ve covered the gist: in simulation-based agent learning, an agent undergoes a training process to develop a policy for its behavior. The policy acts as a mapping from observations to actions, built up from the agent's previous observations, the actions it took in response, and the rewards it earned for doing so. Training takes place across a large number of episodes, during which the cumulative reward should increase as the agent improves at the given task. How training proceeds is partially dictated by hyperparameters that control aspects of agent behavior during training, including the algorithm used to produce the behavior model.
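To make the loop of observations, actions, rewards, and episodes concrete, here is a minimal sketch in plain Python. It is not how ML-Agents trains: tabular Q-learning stands in for the trainers used in earlier chapters, and the five-cell corridor, the reward values, and every name and hyperparameter value below are assumptions made up for illustration.

```python
import random

N_STATES = 5          # hypothetical corridor of cells 0..4; cell 4 is the goal
ACTIONS = (0, 1)      # 0 = step left, 1 = step right
MAX_STEPS = 20        # step limit per episode


def step(state, action):
    """Return (next_state, reward, done) for one move in the corridor."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    if next_state == N_STATES - 1:
        return next_state, 1.0, True      # reaching the goal ends the episode
    return next_state, -0.01, False       # small penalty for every other step


def greedy(q, state):
    """Pick the highest-valued action; ties go to the first action."""
    return max(ACTIONS, key=lambda a: q[state][a])


def run_episode(q, rng, epsilon, alpha, gamma):
    """Play one training episode, updating q in place; return its total reward."""
    state, total = 0, 0.0
    for _ in range(MAX_STEPS):
        # Epsilon-greedy: mostly follow the current policy, sometimes explore.
        action = rng.choice(ACTIONS) if rng.random() < epsilon else greedy(q, state)
        next_state, reward, done = step(state, action)
        # Q-learning update: move the estimate toward reward + discounted future.
        target = reward if done else reward + gamma * max(q[next_state])
        q[state][action] += alpha * (target - q[state][action])
        state, total = next_state, total + reward
        if done:
            break
    return total


def train(episodes=500, epsilon=0.2, alpha=0.5, gamma=0.9, seed=0):
    """Train over many episodes; the keyword arguments are the hyperparameters."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    returns = [run_episode(q, rng, epsilon, alpha, gamma) for _ in range(episodes)]
    return q, returns


q_table, returns = train()
policy = [greedy(q_table, s) for s in range(N_STATES - 1)]
print("greedy action per cell:", policy)
print("mean return, first 10 episodes:", sum(returns[:10]) / 10)
print("mean return, last 10 episodes:", sum(returns[-10:]) / 10)
```

The list `returns` holds the cumulative reward of each episode, which is the quantity we expect to climb as training progresses. The arguments to `train` play the role of hyperparameters: changing `epsilon` or `alpha` changes how the agent behaves while it learns, not what the task is.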

Once the agent is trained, inference is used to query the trained agent model for the appropriate behavior (actions) in response to given stimuli (observations). Learning has ceased by this point, so the agent will no longer improve at the given task.
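As a rough illustration of what "query but don't update" means, the sketch below looks up actions in a fixed table of values. The table, its numbers, and the `act` function are hypothetical stand-ins for a trained model; a real ML-Agents model is a neural network, not a lookup table.

```python
import copy

# Hypothetical action values left behind by training: one row per observation
# (corridor cells 0..3), one column per action (0 = left, 1 = right).
TRAINED_VALUES = [
    [0.62, 0.70],
    [0.62, 0.79],
    [0.70, 0.89],
    [0.79, 1.00],
]


def act(values, observation):
    """Inference: look up the best action for an observation, read-only."""
    row = values[observation]
    return row.index(max(row))


snapshot = copy.deepcopy(TRAINED_VALUES)
actions = [act(TRAINED_VALUES, obs) for obs in range(len(TRAINED_VALUES))]
print(actions)
print("model unchanged:", TRAINED_VALUES == snapshot)
```

However many times `act` is called, the values stay the same, which is why an agent running in inference cannot get any better (or worse) at its task.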

We’ve talked about most of these concepts already:

  • We know about observations, actions, and rewards, and how the mapping between them is used to build up a policy.

  • We know that a training phase occurs over a large number of episodes, and how once this is completed, the agent transitions to inference (only querying the model, not updating it any longer).

  • We know we pass a file of hyperparameters to the mlagents-learn process, but we kind of glossed over that part.

  • We know there are different algorithms to choose ...
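As a reminder of what that hyperparameter file tends to look like, the sketch below builds one in Python and prints it as YAML. The key names follow the ML-Agents trainer configuration format as we understand it, but the behavior name `CorridorAgent`, every value, and the small `to_yaml` helper (written by hand here to avoid a YAML dependency) are assumptions for illustration, so check them against the documentation for your ML-Agents version.

```python
# Hypothetical trainer configuration; values are illustrative, not recommendations.
config = {
    "behaviors": {
        "CorridorAgent": {                 # hypothetical behavior name
            "trainer_type": "ppo",         # the algorithm used for training
            "hyperparameters": {
                "batch_size": 1024,
                "buffer_size": 10240,
                "learning_rate": 0.0003,
                "num_epoch": 3,
            },
            "network_settings": {
                "normalize": False,
                "hidden_units": 128,
                "num_layers": 2,
            },
            "reward_signals": {
                "extrinsic": {"gamma": 0.99, "strength": 1.0},
            },
            "max_steps": 500000,
            "summary_freq": 10000,
        }
    }
}


def to_yaml(mapping, indent=0):
    """Render nested dicts of scalars as YAML lines (a minimal helper)."""
    lines = []
    pad = " " * indent
    for key, value in mapping.items():
        if isinstance(value, dict):
            lines.append(f"{pad}{key}:")
            lines.extend(to_yaml(value, indent + 2))
        else:
            if isinstance(value, bool):
                value = str(value).lower()   # YAML spells booleans true/false
            lines.append(f"{pad}{key}: {value}")
    return lines


yaml_text = "\n".join(to_yaml(config)) + "\n"
print(yaml_text)
```

Saved to a file, text like this is what gets passed to mlagents-learn. Note that the choice of algorithm (`trainer_type`) lives in the same file as the numeric hyperparameters.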

ISBN: 9781492089919