How Criteo optimized and sped up its TensorFlow models by 10x and served them under 5 ms
conference

by Nicolas Kowalski, Axel Antoniotti
February 2020
Intermediate to advanced
38m
English
O'Reilly Media, Inc.
Closed Captioning available in German, English, Spanish, French, Japanese, Korean, Portuguese (Portugal, Brazil), Chinese (Simplified), Chinese (Traditional)

Overview

When you access a web page, bidders such as Criteo must decide within a few dozen milliseconds whether to purchase the advertising space on the page. A real-time auction takes place at that moment, and once communication delays are subtracted, only a handful of milliseconds remain to compute exactly how much to bid. In the past year, Criteo has put a large amount of effort into reshaping the in-house machine learning stack responsible for making such predictions, in particular by opening it to new technologies such as TensorFlow.

Unfortunately, even for simple logistic regression models and small neural networks, Criteo’s initial TensorFlow implementations saw inference time increase by a factor of 100, from 300 microseconds to 30 milliseconds.

Nicolas Kowalski and Axel Antoniotti outline how Criteo approached this issue. They discuss how Criteo profiled its models to understand the bottleneck; why commonly shared solutions, such as optimizing the TensorFlow build for the target hardware, freezing and cleaning up the model, and using accelerated linear algebra (XLA), proved lackluster; and how Criteo rewrote its models from scratch, reimplementing cross-features and hashing functions with low-level TensorFlow operations in order to factorize the TensorFlow nodes in its models as much as possible.
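
As a rough illustration of what factorizing hashed cross-features can mean, the sketch below hashes each raw feature once and then derives every cross from those shared hashes, instead of rehashing a concatenated string per cross. It is written in plain Python rather than TensorFlow, and it is not Criteo's implementation: the function names, the CRC32 hash, the mixing formula, and the bucket count are all assumptions chosen for the example.

```python
import zlib

NUM_BUCKETS = 1 << 20  # hypothetical size of the hashed feature space


def hash_feature(name: str, value: str) -> int:
    """Hash one raw feature value to a 32-bit integer (CRC32 as a stand-in hash)."""
    return zlib.crc32(f"{name}={value}".encode("utf-8"))


def combine(h1: int, h2: int) -> int:
    """Mix two 32-bit hashes into one 32-bit hash (an illustrative mixing step)."""
    mixed = h1 ^ (h2 + 0x9E3779B9 + ((h1 << 6) & 0xFFFFFFFF) + (h1 >> 2))
    return mixed & 0xFFFFFFFF


def cross_feature_buckets(example, crosses, num_buckets=NUM_BUCKETS):
    """Return a bucket index for each requested pair of features."""
    # Shared step: every raw feature is hashed exactly once.
    base = {name: hash_feature(name, value) for name, value in example.items()}
    # Each cross only needs a cheap integer mix of two precomputed hashes.
    return {
        (left, right): combine(base[left], base[right]) % num_buckets
        for left, right in crosses
    }


example = {"country": "FR", "device": "mobile", "hour": "14"}
crosses = [("country", "device"), ("device", "hour")]
buckets = cross_feature_buckets(example, crosses)
```

In a TensorFlow graph the same idea would presumably translate into one shared hashing operation feeding several lightweight integer operations, which is one way the node count could shrink; the talk itself is the authority on what Criteo actually did.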

Prerequisite knowledge

  • A basic understanding of how TensorFlow and TensorFlow Serving work
  • Experience optimizing TensorFlow models for serving (useful but not required)

What you'll learn

  • Understand how to optimize a TensorFlow model before serving it online
  • Discover how to profile a TensorFlow model with a complex preprocessing architecture
  • Learn how and when to replace feature columns with custom cross-features and hashing functions to factorize and drastically reduce the number of nodes in the model

This session is from the 2019 O'Reilly TensorFlow World Conference in Santa Clara, CA.

Publisher Resources

ISBN: 0636920372547