Thompson sampling
Thompson sampling is a simple strategy, introduced 80 years ago, that has received renewed attention in recent years. It is widely used in advertising displays, marketing surveys, and financial analysis. Thompson sampling is a Bayesian strategy, also known as probability matching: the probability of selecting arm n is the probability that n is the arm with the maximum reward [14:4].
The strategy can be summarized as:
- Assign a uniform prior distribution to each arm before any selection is made
- Select arm n with a probability equal to the posterior probability that n is optimal (probability matching)
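The two steps above can be sketched for the common case of binary (success/failure) rewards. This is a minimal illustration, not the book's implementation: the names ThompsonSampler, selectArm, update, pulls, and ThompsonDemo are hypothetical, the reward rates are made up, and the Bernoulli reward model is an assumption. The uniform prior is the Beta(1, 1) distribution, and drawing one sample from each arm's posterior and playing the arm with the largest draw selects arm n with exactly the posterior probability that n is optimal.

```scala
import scala.util.Random

// Hypothetical sketch: Thompson sampling for arms with binary rewards.
final class ThompsonSampler(numArms: Int, rng: Random) {
  // Beta(1, 1) is the uniform prior on [0, 1]; the posterior of each arm
  // is Beta(successes, failures) after the counts are updated.
  private val successes = Array.fill(numArms)(1)
  private val failures = Array.fill(numArms)(1)

  // Gamma(k, 1) for an integer shape k, as the sum of k Exp(1) draws.
  private def gamma(k: Int): Double =
    (0 until k).map(_ => -math.log(1.0 - rng.nextDouble())).sum

  // Beta(a, b) = X / (X + Y) with X ~ Gamma(a, 1) and Y ~ Gamma(b, 1).
  private def beta(a: Int, b: Int): Double = {
    val x = gamma(a)
    val y = gamma(b)
    x / (x + y)
  }

  // Probability matching: draw once from each posterior, play the best draw.
  def selectArm(): Int = {
    val draws = Array.tabulate(numArms)(n => beta(successes(n), failures(n)))
    draws.indexOf(draws.max)
  }

  // Bayesian update of the played arm with the observed binary reward.
  def update(arm: Int, reward: Boolean): Unit =
    if (reward) successes(arm) += 1 else failures(arm) += 1

  // Number of times the arm has been played so far.
  def pulls(arm: Int): Int = successes(arm) + failures(arm) - 2
}

object ThompsonDemo {
  // Simulates `steps` plays against three arms with made-up success rates
  // and returns how often each arm was played.
  def run(seed: Long, steps: Int): Array[Int] = {
    val rng = new Random(seed)
    val trueRates = Array(0.2, 0.5, 0.8) // illustrative values only
    val sampler = new ThompsonSampler(trueRates.length, rng)
    for (_ <- 0 until steps) {
      val arm = sampler.selectArm()
      sampler.update(arm, rng.nextDouble() < trueRates(arm))
    }
    Array.tabulate(trueRates.length)(n => sampler.pulls(n))
  }
}
```

Calling ThompsonDemo.run(42L, 2000) returns the play count per arm. As the posteriors sharpen, the arm with the highest success rate should receive most of the plays. The Beta sampler relies on integer counts to stay dependency-free; a statistics library would be the more usual choice in production code.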
Bandit context
So far, we have discussed K-armed bandits that do not maintain a state or context. It is assumed that all the arms ...