Proposition. The time needed to come within an error tolerance $\eta$ of the risk-sensitive payoff is at most

$$\max_j \, (G_j)^{-1}\left(\log\left[\frac{\mathrm{error}(0)}{\eta}\right]\right)$$

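where $(G_j)^{-1}$ is the inverse of the mapping $t \mapsto G_j(t) := \int_0^t g_j(s)\,ds$, $\mathrm{error}_j(0) := \left\|\hat{r}_{j,0} - \mathbb{E}\,\frac{1}{\mu_j}\left(e^{\mu_j \tilde{U}_j} - 1\right)\right\|$, and $\mathrm{error}(0) := \max_j \mathrm{error}_j(0)$. In particular, for $g_j = 1$ (almost active case), the convergence time is of the order of $\log\left[\frac{\mathrm{error}(0)}{\eta}\right]$.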
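To make the bound concrete, here is a minimal Python sketch that evaluates it, assuming each inverse $(G_j)^{-1}$ is available in closed form. The function names and the second rate, $g_j(t) = 1/(1+t)$, are hypothetical illustrations and do not come from the text; only the case $g_j = 1$ is discussed in the proposition.

```python
import math

def convergence_time_bound(initial_error, eta, inverse_G_list):
    """Evaluate max_j G_j^{-1}(log(error(0) / eta)).

    inverse_G_list holds one callable per player j, each implementing
    the inverse of G_j(t) = integral of g_j over [0, t].
    """
    if initial_error <= eta:
        # Convention assumed for this sketch: already within tolerance.
        return 0.0
    target = math.log(initial_error / eta)
    return max(inv_G(target) for inv_G in inverse_G_list)

def error_at(initial_error, G, t):
    """Error after time t under exponential decay: error(0) * exp(-G(t))."""
    return initial_error * math.exp(-G(t))

# Case from the proposition: g_j = 1, so G_j(t) = t and the inverse is the identity.
inv_G_const = lambda y: y

# Hypothetical slower rate: g_j(t) = 1/(1+t), so G_j(t) = log(1+t)
# and the inverse is y -> exp(y) - 1.
G_decay = lambda t: math.log(1.0 + t)
inv_G_decay = lambda y: math.exp(y) - 1.0

t_const = convergence_time_bound(1.0, 0.01, [inv_G_const])
t_mixed = convergence_time_bound(1.0, 0.01, [inv_G_const, inv_G_decay])

print(f"g_j = 1 only:        t <= {t_const:.3f}")
print(f"with g_j = 1/(1+t):  t <= {t_mixed:.3f}")
```

With $g_j = 1$ the bound grows only logarithmically in $\mathrm{error}(0)/\eta$, while the hypothetical decaying rate makes it grow linearly in that ratio, so the slowest player determines the maximum.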

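Proof. We verify that the solution of the ODE satisfies $\left\|\hat{r}^{rs}_{j,t} - \mathbb{E}\,\frac{1}{\mu_j}\left(e^{\mu_j \tilde{U}_j} - 1\right)\right\| = \mathrm{error}_j(0)\, e^{-\int_0^t g_j(s)\,ds}$. From the assumptions, the primitive function $G_j$ is a bijection, and $\left\|\hat{r}^{rs}_{j,t} - \mathbb{E}\,\frac{1}{\mu_j}\left(e^{\mu_j \tilde{U}_j} - 1\right)\right\| \le \eta$ if $t \ge \max_j (G_j)^{-1}\left(\log\left[\frac{\mathrm{error}(0)}{\eta}\right]\right)$. The last assertion is obtained for $g_j = 1$. This completes the proof.

Explicit Solutions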

As promised in our introduction of this Chapter, ...
