Thompson sampling

Thompson sampling is a simple strategy, first introduced by William R. Thompson in 1933, that has received renewed attention in recent years. It is widely used in advertising displays, marketing surveys, and financial analysis. It is a Bayesian strategy, also known as probability matching: the probability of selecting arm n is the posterior probability that n is the arm with the maximum expected reward [14:4].

The strategy can be summarized as:

  • Assign a uniform prior distribution to the expected reward of each arm, before any selection
  • Select arm n with a probability equal to the posterior probability that n is optimal (probability matching)
  • Update the posterior distribution of the selected arm with the observed reward
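
The summary above can be illustrated with a minimal sketch. It is not the implementation used in this book: it assumes Bernoulli (0/1) rewards, so that the uniform prior is a Beta(1, 1) distribution and the posterior of each arm remains a Beta distribution. The names ThompsonSketch, sampleGamma, sampleBeta, and run are hypothetical and introduced only for this illustration. Drawing one value from the posterior of each arm and playing the arm with the largest draw selects each arm with the posterior probability that it is optimal, which is probability matching.

```scala
import scala.util.Random

// Hypothetical names; Bernoulli (0/1) rewards are an assumption of this sketch.
object ThompsonSketch {
  // Gamma(shape = k, scale = 1) draw for an integer k >= 1,
  // computed as the sum of k exponential draws.
  def sampleGamma(k: Int, rng: Random): Double =
    (1 to k).map(_ => -math.log(1.0 - rng.nextDouble())).sum

  // Beta(a, b) draw for integers a, b >= 1,
  // using X / (X + Y) with X ~ Gamma(a) and Y ~ Gamma(b).
  def sampleBeta(a: Int, b: Int, rng: Random): Double = {
    val x = sampleGamma(a, rng)
    val y = sampleGamma(b, rng)
    x / (x + y)
  }

  // Plays the bandit for `steps` rounds and returns the number of
  // times each arm was selected. `trueRates` are the success
  // probabilities of the simulated arms, unknown to the strategy.
  def run(trueRates: Vector[Double], steps: Int, seed: Long): Vector[Int] = {
    val rng = new Random(seed)
    val k = trueRates.length
    val successes = Array.fill(k)(0)
    val failures = Array.fill(k)(0)
    val pulls = Array.fill(k)(0)

    for (_ <- 0 until steps) {
      // One posterior draw per arm; Beta(1, 1) is the uniform prior.
      val draws = (0 until k).map { i =>
        sampleBeta(1 + successes(i), 1 + failures(i), rng)
      }
      // Probability matching: play the arm with the largest draw.
      val arm = draws.indexOf(draws.max)
      pulls(arm) += 1
      // Simulate the reward and update the posterior counts of that arm.
      if (rng.nextDouble() < trueRates(arm)) successes(arm) += 1
      else failures(arm) += 1
    }
    pulls.toVector
  }
}
```

For example, ThompsonSketch.run(Vector(0.2, 0.5, 0.8), 2000, 42L) returns the selection count of each of three simulated arms; most selections should go to the third arm as its posterior concentrates around the highest success rate. The integer-parameter Beta sampler keeps the sketch free of external dependencies; a statistics library would be the more practical choice for non-integer parameters or large counts.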

Bandit context

So far, we have discussed K-armed bandits that do not maintain a state or context. It is assumed that all the arms ...
