Multi-Agent Machine Learning

Name: Multi-Agent Machine Learning
Author: H. M. Schwartz
ISBN: 9781118362082

August 2014

Intermediate to advanced

256 pages

6h 48m

English

Wiley

Cover
Title
Copyright
Preface
References
Chapter 1: A Brief Review of Supervised Learning
1.1 Least Squares Estimates
1.2 Recursive Least Squares
1.3 Least Mean Squares
1.4 Stochastic Approximation
References
Chapter 2: Single-Agent Reinforcement Learning
2.1 Introduction
2.2 $n$-Armed Bandit Problem
2.3 The Learning Structure
2.4 The Value Function
2.5 The Optimal Value Functions
2.6 Markov Decision Processes
2.7 Learning Value Functions
2.8 Policy Iteration
2.9 Temporal Difference Learning
2.10 TD Learning of the State-Action Function
2.11 Q-Learning
2.12 Eligibility Traces
References
Chapter 3: Learning in Two-Player Matrix Games
3.1 Matrix Games
3.2 Nash Equilibria in Two-Player Matrix Games
3.3 Linear Programming in Two-Player Zero-Sum Matrix Games
3.4 The Learning Algorithms
3.5 Gradient Ascent Algorithm
3.6 WoLF-IGA Algorithm
3.7 Policy Hill Climbing (PHC)
3.8 WoLF-PHC Algorithm
3.9 Decentralized Learning in Matrix Games
3.10 Learning Automata
3.11 Linear Reward–Inaction Algorithm
3.12 Linear Reward–Penalty Algorithm
3.13 The Lagging Anchor Algorithm
3.14 Lagging Anchor Algorithm
References
Chapter 4: Learning in Multiplayer Stochastic Games
4.1 Introduction
4.2 Multiplayer Stochastic Games
4.3 Minimax-Q Algorithm
4.5 The Simplex Algorithm
4.6 The Lemke–Howson Algorithm
4.7 Nash-Q Implementation
4.8 Friend-or-Foe Q-Learning
4.9 Infinite Gradient Ascent
4.10 Policy Hill Climbing
4.11 WoLF-PHC Algorithm
4.12 Guarding a Territory Problem in a Grid World
4.13 Extension of Lagging Anchor Algorithm to Stochastic Games
4.14 The Exponential Moving-Average Q-Learning (EMA Q-Learning) Algorithm
4.15 Simulation and Results Comparing EMA Q-Learning to Other Methods
References
Chapter 5: Differential Games
5.1 Introduction
5.2 A Brief Tutorial on Fuzzy Systems
5.3 Fuzzy Q-Learning
5.4 Fuzzy Actor–Critic Learning
5.5 Homicidal Chauffeur Differential Game
5.6 Fuzzy Controller Structure
5.7 Q($\lambda$)-Learning Fuzzy Inference System
5.9 Learning in the Evader–Pursuer Game with Two Cars
5.10 Simulation of the Game of Two Cars
5.11 Differential Game of Guarding a Territory
5.12 Reward Shaping in the Differential Game of Guarding a Territory
5.13 Simulation Results
References
Chapter 6: Swarm Intelligence and the Evolution of Personality Traits
6.1 Introduction
6.2 The Evolution of Swarm Intelligence
6.3 Representation of the Environment
6.4 Swarm-Based Robotics in Terms of Personalities
6.5 Evolution of Personality Traits
6.6 Simulation Framework
6.7 A Zero-Sum Game Example
6.8 Implementation for Next Sections
6.9 Robots Leaving a Room
6.10 Tracking a Target
6.11 Conclusion
References
Index
End User License Agreement

Content preview from Multi-Agent Machine Learning

Chapter 2: Single-Agent Reinforcement Learning

The objective of this chapter is to introduce the reader to reinforcement learning. A good introductory book on the topic is Reference [1], and we will follow its notation. The goal of reinforcement learning is to maximize a reward. The interesting aspect of reinforcement learning, as with unsupervised learning methods, is the choice of rewards. In this chapter, we discuss some of the fundamental ideas in reinforcement learning that we will refer to in the rest of the book. We start with the simple $n$-armed bandit problem and then present ideas on the meaning of the “value” function.

2.1 Introduction

Reinforcement learning is learning to map situations to actions so as to maximize a numerical reward [1]. The learner is not told which actions to take; it must discover which actions yield the most reward by trying them. Actions may affect not only the immediate reward but also the next situation and, through it, all subsequent rewards [1]. Unlike supervised learning, which learns from examples provided by a knowledgeable external supervisor, reinforcement learning learns from interaction [1]. Since it is often impractical to obtain examples of desired behavior that are both correct and representative of all situations, the learner must be able to learn from its own experience [1]. Therefore, the reinforcement ...
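The trial-and-error idea described above can be illustrated with a small bandit simulation. The following is only a minimal sketch, not the book's own code: it assumes Gaussian rewards with unit variance, an epsilon-greedy rule for choosing between trying a random action and repeating the best one found so far, and sample-average reward estimates. The function name `run_bandit` and all parameter names are hypothetical.

```python
import random


def run_bandit(true_means, steps=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy learning on a bandit with len(true_means) arms.

    Assumption (not from the source): rewards are Gaussian with unit
    variance around each arm's true mean.
    """
    rng = random.Random(seed)
    n = len(true_means)
    estimates = [0.0] * n  # running estimate of each action's mean reward
    counts = [0] * n       # number of times each action has been tried
    total_reward = 0.0

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: try an action at random to learn about it.
            action = rng.randrange(n)
        else:
            # Exploit: pick the action with the highest current estimate.
            action = max(range(n), key=lambda a: estimates[a])

        # The learner only observes a noisy reward, never the true mean.
        reward = rng.gauss(true_means[action], 1.0)

        # Incremental sample-average update of the chosen action's estimate.
        counts[action] += 1
        estimates[action] += (reward - estimates[action]) / counts[action]
        total_reward += reward

    return estimates, counts, total_reward


if __name__ == "__main__":
    estimates, counts, total = run_bandit([0.2, 0.5, 1.5])
    print("estimated mean rewards:", [round(e, 2) for e in estimates])
    print("times each action was tried:", counts)
```

No supervisor ever tells the learner which arm is best. The estimates improve only because the learner keeps trying actions and observing the rewards they return, which is the sense of "learning from interaction" used in the text.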


