Practical Simulations for Machine Learning

by Paris Buttfield-Addison, Mars Buttfield-Addison, Tim Nugent, Jon Manning
June 2022
Beginner to intermediate
331 pages
7h 15m
English
O'Reilly Media, Inc.

Chapter 9. Cooperative Learning

In this chapter, we’re going to take another step forward with our simulations and reinforcement learning, and create a simulation environment in which multiple agents must work together toward a common goal. These sorts of simulations involve cooperative learning, in which agents usually receive their rewards as a group instead of individually. That includes agents that might not have contributed to the actions that resulted in the rewards.

In Unity ML-Agents, the preferred training algorithm for cooperative learning is known as Multi-Agent POsthumous Credit Assignment (or MA-POCA, for short). MA-POCA trains a centralized critic, which acts as a coach for a group of agents. This means agents can still learn what they need to do, even though the group is the entity being rewarded.
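To make the idea of a group reward concrete, here is a minimal toy sketch in Python. It is not the ML-Agents API and not MA-POCA itself: `ToyAgent`, `ToyAgentGroup`, and the `contributed` flag are hypothetical names invented for illustration. It only shows that a reward given to the group is credited to every member, including one that did nothing to earn it. Working out who actually deserves the credit is the job MA-POCA hands to its centralized critic.

```python
from dataclasses import dataclass, field


@dataclass
class ToyAgent:
    """Hypothetical agent record; not an ML-Agents class."""
    name: str
    contributed: bool = False  # did this agent help earn the reward?
    reward: float = 0.0        # cumulative reward credited so far


@dataclass
class ToyAgentGroup:
    """Hypothetical stand-in for a cooperative group of agents."""
    members: list = field(default_factory=list)

    def register(self, agent: ToyAgent) -> None:
        self.members.append(agent)

    def add_group_reward(self, amount: float) -> None:
        # The group is the entity being rewarded, so every member is
        # credited the same amount, whether or not it contributed.
        for agent in self.members:
            agent.reward += amount


group = ToyAgentGroup()
pusher = ToyAgent("pusher", contributed=True)
helper = ToyAgent("helper", contributed=True)
bystander = ToyAgent("bystander", contributed=False)
for agent in (pusher, helper, bystander):
    group.register(agent)

# The team reaches its goal and is rewarded as a whole.
group.add_group_reward(1.0)

for agent in group.members:
    print(f"{agent.name}: contributed={agent.contributed}, reward={agent.reward}")
```

In this toy, the bystander ends up with the same reward as the two agents that did the work, which is exactly the credit assignment problem that a plain per-agent learner struggles with.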

Tip

In cooperative learning environments, you can still give rewards to individual agents if you want. We’ll briefly touch on this later. You can also use other algorithms, or just PPO like usual, but MA-POCA is designed specifically for cooperative learning. You could wire together a collection of PPO-trained agents to get a similar result. We don’t recommend it, though.
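In ML-Agents, the choice between PPO and MA-POCA is made in the trainer configuration file, through the `trainer_type` setting (`ppo` or `poca`). The sketch below builds the smallest possible version of that setting for both trainers so you can see that only one line differs. The behavior name `CooperativeAgent` and the helper functions are assumptions made up for this example, and a real configuration would also carry hyperparameters, network settings, and reward signals that are omitted here.

```python
def trainer_config(behavior_name: str, trainer_type: str) -> dict:
    """Build a minimal ML-Agents-style trainer config as nested dicts.

    Only `trainer_type` is set; everything else a real config needs
    is deliberately left out of this sketch.
    """
    return {"behaviors": {behavior_name: {"trainer_type": trainer_type}}}


def to_yaml_lines(mapping: dict, indent: int = 0) -> list:
    """Render nested dicts of plain scalars as simple block-style YAML."""
    lines = []
    for key, value in mapping.items():
        prefix = " " * indent
        if isinstance(value, dict):
            lines.append(f"{prefix}{key}:")
            lines.extend(to_yaml_lines(value, indent + 2))
        else:
            lines.append(f"{prefix}{key}: {value}")
    return lines


# "CooperativeAgent" is a hypothetical behavior name.
ppo_yaml = "\n".join(to_yaml_lines(trainer_config("CooperativeAgent", "ppo")))
poca_yaml = "\n".join(to_yaml_lines(trainer_config("CooperativeAgent", "poca")))

print(poca_yaml)
```

Switching trainers is a one-line change in the config, but the agents only benefit from MA-POCA if the environment also rewards them as a group.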

A Simulation for Cooperation

Let’s build a simulation environment with a collection of agents that need to work together. This environment has a lot of pieces, so take your time, step through slowly, and take notes if you need to.

Our environment will involve ...


Publisher Resources

ISBN: 9781492089919