Chapter 8. Conclusion
Learning Life Lessons from Bandit Algorithms
In this book, we’ve presented three algorithms for solving the Multiarmed Bandit Problem:
- The epsilon-Greedy Algorithm
- The Softmax Algorithm
- The UCB Algorithm
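As a refresher, the arm-selection rules at the heart of these three algorithms can be sketched as below. This is a minimal sketch, not the book's full class-based implementations: it assumes that the running reward estimates `values` and play counts `counts` are maintained elsewhere, and it uses the UCB1 variant of UCB.

```python
import math
import random

def select_arm_epsilon_greedy(values, epsilon):
    # With probability epsilon, explore a random arm; otherwise exploit the best.
    if random.random() < epsilon:
        return random.randrange(len(values))
    return values.index(max(values))

def select_arm_softmax(values, temperature):
    # Choose each arm with probability proportional to exp(value / temperature).
    z = [math.exp(v / temperature) for v in values]
    total = sum(z)
    r, cumulative = random.random(), 0.0
    for i, weight in enumerate(z):
        cumulative += weight / total
        if r <= cumulative:
            return i
    return len(z) - 1

def select_arm_ucb(values, counts):
    # Play every arm once first; then add a confidence bonus to each estimate
    # so that rarely played arms are revisited (UCB1).
    for i, n in enumerate(counts):
        if n == 0:
            return i
    total = sum(counts)
    ucb = [v + math.sqrt(2.0 * math.log(total) / n)
           for v, n in zip(values, counts)]
    return ucb.index(max(ucb))
```

Note how the three rules trade off exploration differently: epsilon-Greedy explores uniformly at random, Softmax explores in proportion to estimated value, and UCB explores based on uncertainty.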
To take full advantage of these three algorithms, you'll need to develop a good intuition for how they'll behave when you deploy them on a live website. That intuition matters because there is no universal bandit algorithm that will always do the best job of optimizing a website: domain expertise and good judgment will always be necessary.
To help you develop the intuition and judgment you’ll need, we’ve advocated a Monte Carlo simulation framework that lets you see how these algorithms and others will behave in hypothetical worlds. By testing an algorithm in many different hypothetical worlds, you can build an appreciation for the qualitative dynamics that cause a bandit algorithm to succeed in one scenario and to fail in another.
In this last section, we’d like to help you further down that path by highlighting these qualitative patterns explicitly.
We’ll start off with some general life lessons that we think are exemplified by bandit algorithms, but actually apply to any situation you might ever find yourself in. Here are the most salient lessons:
- Trade-offs, trade-offs, trade-offs
- In the real world, you always have to trade off between gathering data and acting on that data. Pure experimentation ...