5

Learning under Delayed Measurement

Education is what survives when what has been learned has been forgotten, B. F. Skinner

5.1    Introduction

In this chapter, we present strategies that are robust to imperfect state information and outdated measurements on the players’ side. We develop fully distributed reinforcement learning schemes under uncertain state and delayed feedback. We provide the asymptotic pseudo-trajectories of the delayed schemes. To account for imperfect payoff measurement on the players’ side, we propose a delayed COmbined fully DIstributed Payoff and Strategy Reinforcement Learning scheme (delayed CODIPAS-RL, [162, 175, 89]) in which each player learns her expected payoff function as well as the associated ...
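To make the idea of learning payoffs and strategies together from outdated measurements concrete, the following is a minimal single-player sketch in Python. It is an illustration under stated assumptions, not the delayed CODIPAS-RL scheme developed in this chapter: the function names, the fixed delay, the Gaussian measurement noise, the Boltzmann-Gibbs (softmax) strategy target, and the step sizes are all hypothetical choices made for the example.

```python
import math
import random
from collections import deque


def softmax(values, temperature):
    """Boltzmann-Gibbs distribution over a list of payoff estimates."""
    m = max(values)  # subtract the maximum for numerical stability
    weights = [math.exp((v - m) / temperature) for v in values]
    total = sum(weights)
    return [w / total for w in weights]


def run_delayed_learner(mean_payoffs, delay, steps, temperature=0.1, seed=0):
    """Hypothetical learner: one player, noisy payoffs observed `delay` steps late.

    The player keeps a payoff estimate per action and a mixed strategy,
    and updates both only when an outdated measurement arrives.
    """
    rng = random.Random(seed)
    n = len(mean_payoffs)
    estimates = [0.0] * n      # estimated expected payoff per action
    strategy = [1.0 / n] * n   # mixed strategy, starts uniform
    counts = [0] * n           # measurements received per action
    pending = deque()          # (action, payoff) pairs not yet delivered

    for t in range(1, steps + 1):
        # Play an action drawn from the current mixed strategy.
        action = rng.choices(range(n), weights=strategy)[0]
        # Assumed environment: fixed mean payoff plus Gaussian noise.
        payoff = mean_payoffs[action] + rng.gauss(0.0, 0.1)
        pending.append((action, payoff))

        # A measurement becomes available only `delay` steps after the play.
        if len(pending) > delay:
            old_action, old_payoff = pending.popleft()
            counts[old_action] += 1
            # Payoff learning: running average for the action played earlier.
            step = 1.0 / counts[old_action]
            estimates[old_action] += step * (old_payoff - estimates[old_action])
            # Strategy learning: move toward the softmax of the estimates.
            rate = 1.0 / (t ** 0.6)
            target = softmax(estimates, temperature)
            strategy = [(1 - rate) * x + rate * y
                        for x, y in zip(strategy, target)]

    return strategy, estimates


if __name__ == "__main__":
    strategy, estimates = run_delayed_learner([0.2, 0.8], delay=3, steps=2000)
    print("strategy:", strategy)
    print("payoff estimates:", estimates)
```

In this sketch the delay only shifts which past action a measurement is credited to; the player never needs to know the payoff function or the other players' choices, which is the sense in which such schemes are fully distributed.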
