Test-Driven Machine Learning by Justin Bozonier

Multi-armed bandit throw down

To compare the two algorithms, we're going to build a distribution that represents the payoffs of each algorithm, and do a quick test to see if the RPMBandit is, in fact, better than the SimpleBandit algorithm.

The following is a simulation harness that I've built to compare the two:

def run_bandit_sim(bandit_algorithm):
    # Simulated scenario with two treatments and known payoff characteristics.
    simulated_experiment = BanditScenario({
        'A': {'conversion_rate': 1, 'order_average': 35.00},
        'B': {'conversion_rate': 1, 'order_average': 50.00}
    })
    simple_bandit = bandit_algorithm
    # Send 500 simulated visitors through the bandit and log each payout.
    for visitor_i in range(500):
        treatment = simple_bandit.choose_treatment()
        payout = simulated_experiment.next_visitor(treatment)
        simple_bandit.log_payout(treatment, payout)
    # Total payoff earned across all simulated visitors.
    return sum(simulated_experiment._bandit_payoffs)
...

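To build the payoff distribution mentioned earlier, one way to use this harness is to run it many times per algorithm and compare the resulting totals. The following is a minimal sketch of that idea, assuming SimpleBandit and RPMBandit can be constructed with no arguments and that 1,000 runs per algorithm is a reasonable sample size; both details are assumptions, not part of the harness above.

# Run the simulation repeatedly to build a payoff distribution per algorithm.
# NOTE: the no-argument constructors and the 1,000 runs are assumptions.
simple_payoffs = [run_bandit_sim(SimpleBandit()) for _ in range(1000)]
rpm_payoffs = [run_bandit_sim(RPMBandit()) for _ in range(1000)]

# Quick comparison of the two distributions by their average total payoff.
simple_mean = sum(simple_payoffs) / len(simple_payoffs)
rpm_mean = sum(rpm_payoffs) / len(rpm_payoffs)
print('SimpleBandit average payoff:', simple_mean)
print('RPMBandit average payoff:', rpm_mean)
print('RPMBandit did better' if rpm_mean > simple_mean else 'SimpleBandit did better')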