Search logs + machine learning = autotagged inventory
conference

by John Berryman
February 2020
Intermediate
37m
English
O'Reilly Media, Inc.
Closed Captioning available in German, English, Spanish, French, Japanese, Korean, Portuguese (Portugal, Brazil), Chinese (Simplified), Chinese (Traditional)

Overview

For ecommerce applications, matching users with the items they want is the name of the game. If they can’t find what they want, then how can they buy anything? Typically, this functionality is provided through the search and browse experience. Search allows users to type in text and match against the text of the items in the inventory. Browse allows users to select filters and slice and dice the inventory down to the subset they’re interested in. But with the shift toward mobile devices, no one wants to type anymore—thus browse is becoming dominant in the ecommerce experience.

But there’s a problem if your inventory isn’t categorized. Perhaps your inventory is user generated or generated by external providers who don’t tag and categorize the inventory. No categories and no tags means no browse experience and missed sales. You could hire an army of taxonomists and curators to tag items, but training and curation will be expensive. You can demand that your providers tag their items and adhere to your taxonomy—but providers will buck this new requirement unless they see obvious and immediate benefit. Worse, providers might use tags to game the system—artificially placing themselves in the wrong category to drive more sales. Worst of all, creating the right taxonomy is hard. You have to structure a taxonomy to realistically represent how your customers think about the inventory.

Eventbrite is investigating a tantalizing alternative: using a combination of customer interactions and machine learning to automatically tag and categorize its inventory. As customers interact with the platform—as they search for events and click on and purchase events that interest them—Eventbrite implicitly gathers information about how its users think about its inventory. Search text effectively acts like a tag, and a click on an event card is a vote that the clicked event is representative of that tag. Eventbrite uses this stream of information as training data for a machine learning classification model, and as Eventbrite receives new inventory, it can automatically tag it with the text that customers will likely use when searching for it. This makes it possible to better understand the inventory, supply and demand, and most importantly this allows Eventbrite to build the browse experience that customers demand.
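The idea of treating search text as a tag and each click as a vote can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Eventbrite's pipeline or model: the log format, the sample data, the function names, and the `min_votes` threshold are all hypothetical, and the word-overlap scorer is a deliberately simple stand-in for the machine learning classifier described in the talk.

```python
from collections import Counter, defaultdict


def build_tag_votes(click_log):
    """Count, per event, how often each normalized query led to a click."""
    votes = defaultdict(Counter)
    for query, event_id in click_log:
        tag = " ".join(query.lower().split())  # lowercase, collapse whitespace
        if tag:
            votes[event_id][tag] += 1
    return votes


def training_labels(votes, min_votes=2):
    """Keep only tags with enough click votes to be trusted as labels."""
    labels = {}
    for event_id, counts in votes.items():
        tags = sorted(t for t, n in counts.items() if n >= min_votes)
        if tags:
            labels[event_id] = tags
    return labels


def fit_tag_profiles(descriptions, labels):
    """Build a word-count profile for each tag from labeled event text."""
    profiles = defaultdict(Counter)
    for event_id, tags in labels.items():
        words = descriptions[event_id].lower().split()
        for tag in tags:
            profiles[tag].update(words)
    return profiles


def suggest_tags(text, profiles, top_n=1):
    """Rank tags by the share of each profile's words found in the text."""
    words = set(text.lower().split())
    scores = {}
    for tag, profile in profiles.items():
        total = sum(profile.values())
        matched = sum(n for w, n in profile.items() if w in words)
        if matched:
            scores[tag] = matched / total
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_n]


# Invented sample data, for illustration only.
click_log = [
    ("yoga", "e1"), ("Yoga ", "e1"), ("yoga", "e2"), ("yoga", "e2"),
    ("jazz", "e3"), ("jazz", "e3"), ("jazz", "e1"),
]
descriptions = {
    "e1": "sunrise yoga class in the park",
    "e2": "beginner yoga and meditation class",
    "e3": "live jazz quartet at the club",
}

labels = training_labels(build_tag_votes(click_log))
profiles = fit_tag_profiles(descriptions, labels)
new_event_tags = suggest_tags("evening meditation class for beginners", profiles)
```

In this toy data, the single stray "jazz" click on event `e1` falls below the vote threshold and is dropped, which hints at how a threshold can dampen noisy or accidental clicks. A new event with no click history still receives a tag because its description shares words with events that customers reached by searching for "yoga".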

John Berryman takes a deep dive into the problem space and Eventbrite’s approach. He explores how the company gathered training data from its search and click logs, and how it built and refined the model. You’ll see the output of the model, the positive results of Eventbrite’s work so far, and the work left to be done. You’ll leave with some new ideas to take back to your business.

Prerequisite knowledge

  • A basic understanding of machine learning involving text manipulation, classification algorithms, and neural networks

What you'll learn

  • Learn a technique for generating tags for products based on the search behavior of customers

This session is from the 2019 O'Reilly Strata Conference in New York, NY.

Publisher Resources

ISBN: 0636920372349